Pulsar Functions is a succinct computing abstraction that Apache Pulsar provides for expressing simple ETL and streaming tasks. Its simplicity is twofold: a simple interface and simple deployment. As adoption has grown, we realized that native support for organizing multiple functions into a single unit would be very beneficial. With such support, people can express and manage multi-stage jobs easily. It also opens the door to a higher-level DSL abstraction that further simplifies job composition. We call this new feature Function Mesh.
This talk aims to provide a thorough walkthrough of this new Function Mesh Feature, including its design, implementation, use cases, and examples, to help people seeking simple streaming solutions understand this newly created powerful tool in Apache Pulsar.
Function Mesh: Complex Streaming Jobs Made Simple - Pulsar Summit NA 2021
1. Pulsar Virtual Summit North America 2021
Function Mesh
Complex Streaming Jobs In A Simple Way
2. Pulsar Virtual Summit North America 2021
Neng Lu is a software engineer at StreamNative
where he drives the development of Apache
Pulsar and the integrations with big data
ecosystem. Before that, he was a senior software
engineer at Twitter focusing on stream
processing engine development. Before joining
Twitter, he got his master's degree from UCLA and
a bachelor's degree from Zhejiang University.
Rui Fu is a software engineer at StreamNative.
Before joining StreamNative, he was a platform
engineer at the Energy Internet Research Institute
of Tsinghua University, where he led stream data
processing and IoT platform development.
Rui received his postgraduate
degree from HKUST and an undergraduate
degree from The University of Sheffield.
3. Pulsar Virtual Summit North America 2021
StreamNative
Founded by the creators of Apache Pulsar, StreamNative provides a
cloud-native, unified messaging and streaming platform powered by
Apache Pulsar to support multi-cloud and hybrid-cloud strategies
4. Pulsar Virtual Summit North America 2021
Agenda
I. Pulsar Function
II. Function Mesh
III. Demo
6. Pulsar Virtual Summit North America 2021
Pulsar Functions
Pulsar Functions are lightweight compute processes that:
● consume messages from Pulsar topics
● apply a user-supplied processing logic to each message
● publish results to another Pulsar topic
7. Pulsar Virtual Summit North America 2021
Pulsar Functions
Use Cases
● ETL Jobs
● Real-time Aggregation
● Microservices
● Reactive Services
● Event Routing
8. Pulsar Virtual Summit North America 2021
Pulsar Functions
IS & IS NOT
Pulsar Functions IS:
● Lambda-style processing units that are specifically
designed to integrate with Pulsar
Pulsar Functions IS NOT:
● Another Full-Power Streaming Processing Engine
10. Pulsar Virtual Summit North America 2021
Pulsar Connectors
Pulsar Connectors are a special form of Pulsar Functions that:
● Come in two types: source and sink
● Enable users to easily integrate with external data systems
○ Source: feed data from external system into Pulsar
○ Sink: send data from Pulsar to external system
12. Pulsar Virtual Summit North America 2021
Pulsar Functions
API
[Diagram: input "Hello" -> Pulsar Function -> output "Hello!"]
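The Hello -> Hello! example on this slide can be sketched as a native Python function. This is a minimal sketch, assuming the slide showed the usual "exclamation" example; Pulsar's Python runtime accepts a plain module-level `process` function, and the sample input below is only for illustration.

```python
# Native Python Pulsar Function: the runtime calls process() once per message
# and publishes the return value to the output topic.
def process(input):
    # Append an exclamation mark to the incoming message payload.
    return "{}!".format(input)


if __name__ == "__main__":
    # Local check without a Pulsar cluster.
    print(process("Hello"))  # prints: Hello!
```

Because the function has no Pulsar imports, it can be unit tested locally before it is deployed.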
13. Pulsar Virtual Summit North America 2021
Pulsar Functions
Semantics
● ATMOST_ONCE
○ Message is ACKed to Pulsar once received
● ATLEAST_ONCE
○ Message is ACKed to Pulsar after the function completes -- Default
● EFFECTIVELY_ONCE
○ Utilizes Pulsar’s Effectively Once Semantics
● EXACTLY_ONCE [TO_BE_ADDED]
○ Transactions (Txn) from Pulsar 2.8.0
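The difference between the first two guarantees comes down to when the message is acknowledged relative to processing. The following is a toy in-memory simulation, not Pulsar's implementation: the `run` loop, the `make_flaky` helper, and the retry limit are all hypothetical names introduced for illustration.

```python
from collections import deque


def run(messages, process, ack_before_processing, max_attempts=3):
    """Toy delivery loop. Returns (results, lost)."""
    queue = deque((m, 1) for m in messages)
    results, lost = [], []
    while queue:
        msg, attempt = queue.popleft()
        try:
            results.append(process(msg))
        except Exception:
            if ack_before_processing:
                # ATMOST_ONCE: already acked on receipt, so never redelivered.
                lost.append(msg)
            elif attempt < max_attempts:
                # ATLEAST_ONCE: not yet acked, so the message is delivered again.
                queue.append((msg, attempt + 1))
            else:
                lost.append(msg)
    return results, lost


def make_flaky():
    """Build a function that fails the first time it sees message 'b'."""
    seen = set()

    def process(msg):
        if msg == "b" and msg not in seen:
            seen.add(msg)
            raise RuntimeError("transient failure")
        return msg + "!"

    return process


at_most = run(["a", "b", "c"], make_flaky(), ack_before_processing=True)
at_least = run(["a", "b", "c"], make_flaky(), ack_before_processing=False)
print(at_most)   # (['a!', 'c!'], ['b'])
print(at_least)  # (['a!', 'c!', 'b!'], [])
```

In this sketch the retried message is re-queued at the end, so ordering changes; that is a property of the toy loop and says nothing about Pulsar's redelivery order.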
14. Pulsar Virtual Summit North America 2021
Pulsar Functions
Runtime
● Thread: Invoke functions in threads inside the functions worker.
● Process: Invoke functions in processes forked by the
functions worker.
● Kubernetes: The functions worker submits functions as
Kubernetes StatefulSets.
19. Pulsar Virtual Summit North America 2021
Function Mesh
IS & IS NOT
Function Mesh IS a Kubernetes Framework for:
● Integrating separate functions together to process data
● Utilizing Kubernetes native resources and scheduling
capability
Function Mesh IS NOT a full-power streaming engine
20. Pulsar Virtual Summit North America 2021
Function Mesh
Why
● Inconsistency in Pulsar Functions' Kubernetes runtime
● Metadata topics may cause a broker crash loop
● Cannot use functions across Pulsar clusters
● Hard to use Kubernetes features
● A lot of manual work is needed for complex jobs
21. Pulsar Virtual Summit North America 2021
Function Mesh
Internals
● Kubernetes Operator
● Function Runner
● Mesh Worker Service
22. Pulsar Virtual Summit North America 2021
Function Mesh
Custom Resource Definition
● Function
● Source
● Sink
23. Pulsar Virtual Summit North America 2021
Function Mesh
Custom Resource Definition
● FunctionMesh (a.k.a. Mesh)
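To make the FunctionMesh resource concrete, here is a rough sketch that builds a two-stage mesh manifest as a Python dictionary and prints it as JSON, which `kubectl` accepts alongside YAML. The field names (`compute.functionmesh.io/v1alpha1`, `functions`, `input.topics`, `output.topic`, `pulsar.pulsarConfig`) are recalled from the project's samples and should be checked against the CRD version you install; the topic names, function names, class names, and the `stage` helper are hypothetical.

```python
import json


def stage(name, class_name, in_topic, out_topic):
    """Build one function entry of the mesh (hypothetical helper)."""
    return {
        "name": name,
        "className": class_name,
        "replicas": 1,
        "input": {"topics": [in_topic]},
        "output": {"topic": out_topic},
        # Assumed: name of a ConfigMap holding the Pulsar service URLs.
        "pulsar": {"pulsarConfig": "mesh-pulsar-config"},
    }


IN = "persistent://public/default/in"
MID = "persistent://public/default/mid"
OUT = "persistent://public/default/out"

mesh = {
    "apiVersion": "compute.functionmesh.io/v1alpha1",
    "kind": "FunctionMesh",
    "metadata": {"name": "mesh-sample"},
    "spec": {
        "functions": [
            # Stages are chained through topics: stage 1 writes to MID,
            # stage 2 reads from MID.
            stage("clean", "example.CleanFunction", IN, MID),
            stage("enrich", "example.EnrichFunction", MID, OUT),
        ]
    },
}

print(json.dumps(mesh, indent=2))
```

A real manifest also needs runtime details such as the function image or package location, which are omitted here.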
29. Pulsar Virtual Summit North America 2021
Function Mesh
Auto-Scaling
● Kubernetes Horizontal Pod Autoscaler (HPA)
● maxReplicas
● Disabled by default
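As a rough illustration of how these pieces fit together, the sketch below applies the replica formula documented for the Kubernetes HPA, `ceil(currentReplicas * currentMetric / targetMetric)`, and clamps the result between `replicas` and `maxReplicas`. Treating auto-scaling as enabled only when `maxReplicas` is set and greater than `replicas` is an assumption about Function Mesh's behavior, and the function names and CPU figures are hypothetical.

```python
import math


def autoscaling_enabled(replicas, max_replicas=None):
    # Assumption: scaling stays off unless maxReplicas is set above replicas.
    return max_replicas is not None and max_replicas > replicas


def desired_replicas(current, replicas, max_replicas, current_cpu, target_cpu):
    """Return the replica count an HPA-style controller would aim for."""
    if not autoscaling_enabled(replicas, max_replicas):
        return replicas
    # Kubernetes HPA formula, then clamp to [replicas, maxReplicas].
    raw = math.ceil(current * current_cpu / target_cpu)
    return max(replicas, min(max_replicas, raw))


# CPU utilization values are percentages.
print(desired_replicas(2, 1, 5, current_cpu=90, target_cpu=60))     # 3
print(desired_replicas(4, 1, 5, current_cpu=120, target_cpu=60))    # 5
print(desired_replicas(2, 1, None, current_cpu=90, target_cpu=60))  # 1
```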
30. Pulsar Virtual Summit North America 2021
Function Mesh
Internals
● Kubernetes Operator
● Function Runner
● Mesh Worker Service
31. Pulsar Virtual Summit North America 2021
Function Mesh
Mesh Worker Service
● Supports pulsar-admin APIs
● Works with Pulsar 2.8.0
● Function/Sink/Source
● Customizable built-in connectors
32. Pulsar Virtual Summit North America 2021
Function Mesh
Summary
● Eases management
● Utilizes the full power of the Kubernetes scheduler
● Makes functions first-class citizens in cloud environments
● Opens up the potential to talk to different messaging systems
33. Pulsar Virtual Summit North America 2021
Function Mesh
Future
● Improve the capability level of the Function Mesh operator.
● Reach feature parity with Pulsar Functions.
● Support additional runtimes based on a self-contained function
runtime, such as WebAssembly.
● Develop better tools/frontends to manage and inspect Function
Meshes.
● Group individual functions together to improve latency and
reduce cost.
● Support advanced auto-scaling based on Pulsar metrics.
34. Pulsar Virtual Summit North America 2021
Get Function Mesh
● Website
○ https://functionmesh.io/
● Open source
○ https://github.com/streamnative/function-mesh
● One-key installation script
● Helm charts
38. Pulsar Virtual Summit North America 2021
We’re hiring
Build Pulsar with the team that builds Pulsar
✓ Work with the creators of Pulsar
✓ Exciting, growth-stage company
✓ Open and collaborative environment
✓ Competitive compensation and
benefits
✓ Best teammates on earth
https://streamnative.io/careers
40. Pulsar Virtual Summit North America 2021
StreamNative Cloud
Fully-managed Apache Pulsar-as-a-Service
● Massive scale without the ops overhead
● Built for hybrid and multi-cloud
○ Cloud-Hosted & Cloud-Managed
● Stream across public clouds for
multi-cloud applications
● Elastic, consumption-based pricing with
‘pay as you go’ model
● Reliably scale mission-critical apps
https://streamnative.io/cloud
41. Pulsar Virtual Summit North America 2021
StreamNative Platform
Self-managed enterprise offering of Pulsar
✓ Kafka-on-Pulsar
✓ Function Mesh for serverless streaming
✓ Enterprise-ready security
✓ Pulsar Operators
✓ Seamless StreamNative Cloud
experience
https://streamnative.io/platform