2. Bigdata Intelligence Platform BICube
Contents
I. Paradigm Shift
II. Machine Learning
III. Neural Stream
IV. FinTech
V. Fraud Detection System
VI. Conclusions
10. Bigdata Intelligence Platform BICube
II. Machine Learning
Starting from data...
• Machine + Learning
• Machine learning means teaching a machine (a computer) how to learn from data.
Therefore, data is the teacher.
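As a small illustration of "data is the teacher", the sketch below lets a program infer a rule from examples instead of having the rule written by hand. Ordinary least squares on one variable is used only because it is the simplest case; the function name `fit_line` and the sample numbers are illustrative assumptions, not taken from the slides.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Learn slope and intercept of y = slope * x + intercept from examples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and co-variation of x and y, around their means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data only: the rule y = 2x + 1 is never written in the code,
# it is recovered from the examples.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```

Changing the examples changes the learned rule without touching the code, which is the sense in which the data does the teaching.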
14. Bigdata Intelligence Platform BICube
II. Machine Learning
ML Lifecycle
[Figure: ML Lifecycle diagram. Labeled components: ML Modeling, ML Deploy, ML Optimizer, New Data, Decision Making, Alert, Anomaly Store, Hadoop DFS/NoSQL/Hive]
45. Bigdata Intelligence Platform BICube
IV. FinTech
The Kreditech Group uses big data, complex algorithms and automated
workflows to serve a simple mission: “Better banking for everyone”. Based
on 20,000 dynamic data points, the unique technology is capable of scoring
everyone worldwide, including the 4bn individuals without credit score.
Deploying the technology makes physical contact and paper exchange
redundant. Funds can be paid out within seconds to a credit card, bank
account or NFC wallet, 24/7.
71. Bigdata Intelligence Platform BICube
VI. Conclusions
Classical rule-based approach
• Always “too late”:
  • A new fraud pattern is “invented” by criminals
  • Cardholders lose money and complain
  • Banks investigate the complaints and try to understand the new pattern
  • A new rule is implemented a few weeks later
• Expensive to build (knowledge intensive)
• Difficult to maintain:
  • Many rules
  • The situation changes dynamically, so rules frequently have to be added, modified, or removed …
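The "too late" problem can be seen in a minimal sketch of a hand-written rule engine. Everything here is hypothetical: the rule names, the thresholds, and the transaction fields (`amount`, `country`, `home_country`, `hour`) are invented for illustration and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Transaction = Dict[str, Any]


@dataclass
class Rule:
    name: str
    predicate: Callable[[Transaction], bool]


# Hypothetical expert-written rules; each one encodes a fraud pattern
# that was already observed and investigated.
RULES: List[Rule] = [
    Rule("large_amount", lambda t: t["amount"] > 5000),
    Rule(
        "foreign_night_withdrawal",
        lambda t: t["country"] != t["home_country"] and t["hour"] < 5,
    ),
]


def triggered_rules(tx: Transaction, rules: List[Rule] = RULES) -> List[str]:
    """Return the names of all rules that fire for this transaction."""
    return [rule.name for rule in rules if rule.predicate(tx)]


# A known pattern is caught ...
known = {"amount": 9000, "country": "KR", "home_country": "KR", "hour": 14}
# ... but a newly invented pattern (for example, many small purchases)
# matches no rule until someone writes one.
novel = {"amount": 40, "country": "KR", "home_country": "KR", "hour": 14}
```

Here `triggered_rules(known)` reports `large_amount`, while `triggered_rules(novel)` reports nothing. Every new pattern means another entry in `RULES`, which is where the maintenance cost comes from.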
72. Bigdata Intelligence Platform BICube
VI. Conclusions
A perfect fraud detection system:
• “Tuned” to every cardholder or bank account: each cardholder or bank account is treated individually
• Adaptive: evolves with slow/small changes in cardholder behavior
• Fast (real-time)
• High accuracy
A system based on profiles
• Every cardholder gets a vector of parameters that describes his/her behavior: an “average-behavior” profile
• The system constantly compares this “long-term” profile with the cardholder’s recent behavior
• Transactions that do not fit the cardholder’s profile are flagged as suspicious (or are blocked)
• Profiles are updated with every single transaction, so the system constantly adapts to (slow and small) changes in cardholders’ behavior
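The profile idea can be sketched with a single feature. The slides describe a vector of parameters per cardholder; the version below tracks only the transaction amount, using an exponentially weighted mean and variance as the "long-term" profile. The class name `CardProfile`, the smoothing factor `alpha`, and the threshold of 4 standard deviations are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class CardProfile:
    mean: float          # long-term average amount
    var: float           # long-term variance of the amount
    alpha: float = 0.05  # assumed smoothing factor: small = slow adaptation

    def score(self, amount: float) -> float:
        """Distance of this amount from the profile, in standard deviations."""
        return abs(amount - self.mean) / math.sqrt(self.var + 1e-9)

    def update(self, amount: float) -> None:
        """Exponentially weighted update of mean and variance."""
        delta = amount - self.mean
        self.mean += self.alpha * delta
        self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)


def process(profile: CardProfile, amount: float, threshold: float = 4.0) -> bool:
    """Flag the transaction if it does not fit the profile, then update."""
    suspicious = profile.score(amount) > threshold
    # Following the slide, the profile is updated with every transaction.
    profile.update(amount)
    return suspicious


profile = CardProfile(mean=50.0, var=100.0)
ordinary = process(profile, 55.0)    # close to the usual amount
unusual = process(profile, 5000.0)   # far outside the usual range
```

Because each update moves the mean by only a fraction `alpha` of the deviation, gradual changes in spending are absorbed into the profile, while a sudden jump stands out. A production system might choose not to update on flagged transactions; that is a design decision the slides do not address.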
73. Bigdata Intelligence Platform BICube
VI. Conclusions
Challenge: real-time detection!
• Monitor all POS/ATM transactions in real time
• Detect unusual patterns and block compromised cards as quickly as possible
• Ideally: block compromised cards before fraud is discovered!
• A big question: can we do it?
• Some numbers:
  • 3,000,000,000 transactions per year
  • up to 15,000,000 transactions per day
  • up to 400 transactions per second (peak hours)
  • 100,000,000 cards
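A quick back-of-the-envelope check shows what these numbers imply. The four input figures are from the slide; the derived quantities, and the simplifying assumption of strictly one-at-a-time processing used for the time budget, are added for illustration.

```python
# Figures quoted on the slide.
TX_PER_YEAR = 3_000_000_000
PEAK_TX_PER_DAY = 15_000_000
PEAK_TX_PER_SECOND = 400
CARDS = 100_000_000

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Derived averages.
avg_tx_per_day = TX_PER_YEAR / 365                   # about 8.2 million
avg_tx_per_second = avg_tx_per_day / SECONDS_PER_DAY  # about 95
tx_per_card_per_year = TX_PER_YEAR / CARDS           # 30.0

# How far the peaks sit above the averages.
peak_to_avg_day = PEAK_TX_PER_DAY / avg_tx_per_day           # about 1.8x
peak_to_avg_second = PEAK_TX_PER_SECOND / avg_tx_per_second  # about 4.2x

# Assumption: if transactions were scored strictly one at a time,
# the peak rate leaves this much time per transaction.
serial_budget_ms = 1000 / PEAK_TX_PER_SECOND  # 2.5 ms
```

With only about 30 transactions per card per year, each individual profile is built from sparse data, while the system as a whole has a few milliseconds per decision at peak load.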
74. Bigdata Intelligence Platform BICube
VI. Conclusions
Speed is the key!
• Maintain a sliding buffer of the last billion transactions in RAM (fast memory)
• Organize the transactions so that certain queries can be executed very fast
• Develop clever algorithms that operate on this data structure
• Will it work? Yes, it will! Yes, it does …

• many transactions (billions), so algorithms must be efficient
• mixed variable types (generally not text or images)
• large number of variables
• incomprehensible variables, irrelevant variables
• different misclassification costs
• many ways of committing fraud
• unbalanced class sizes (c. 0.1% of transactions fraudulent)
• delay in labelling
• mislabelled classes
• random transaction arrival times
• (reactive) population drift
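The sliding-buffer idea can be sketched as a fixed-capacity queue plus a per-card index, so that "recent transactions of this card" is a dictionary lookup rather than a scan. This is a minimal illustration, not the layout of the system described in the slides: the class name `SlidingBuffer`, the tuple layout `(card_id, timestamp, amount)`, and the choice of a per-card index as the fast query are all assumptions.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

# Assumed record layout: (card_id, timestamp, amount).
Tx = Tuple[str, float, float]


class SlidingBuffer:
    """Keeps the last `capacity` transactions, indexed by card."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.buffer: Deque[Tx] = deque()                       # global arrival order
        self.by_card: Dict[str, Deque[Tx]] = defaultdict(deque)  # per-card order

    def add(self, tx: Tx) -> None:
        if len(self.buffer) == self.capacity:
            # Evict the globally oldest transaction. Both structures keep
            # arrival order, so it is also the oldest entry for its card.
            old = self.buffer.popleft()
            card_queue = self.by_card[old[0]]
            card_queue.popleft()
            if not card_queue:
                del self.by_card[old[0]]
        self.buffer.append(tx)
        self.by_card[tx[0]].append(tx)

    def recent(self, card_id: str) -> List[Tx]:
        """Transactions of one card still in the buffer, oldest first."""
        return list(self.by_card.get(card_id, ()))


buf = SlidingBuffer(capacity=3)
for tx in [("A", 1.0, 10.0), ("B", 2.0, 20.0), ("A", 3.0, 30.0), ("C", 4.0, 40.0)]:
    buf.add(tx)
```

After the fourth insert the oldest transaction of card A has been evicted, so `buf.recent("A")` holds only the later one. Insertion, eviction, and lookup are all constant-time per transaction. Python objects are far too heavy for a billion records; at an assumed 32 bytes per packed record, a billion transactions would occupy roughly 32 GB, which is why a compact in-memory layout matters.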