
Lightning Talk: Spinning up micro-services using Ruby/Kafka

Ankita Gupta • June 23, 2017 • Singapore • Lightning Talk

In this lightning talk presented at the Red Dot Ruby Conference 2017, Ankita Gupta, a software engineer at honestbee, discusses the complexities of managing large monolithic applications and explores the advantages of transitioning to micro-services architecture using Ruby and Kafka. As applications grow, deployment and testing cycles become increasingly cumbersome, making micro-services an appealing solution to these issues. However, this shift introduces new challenges that need careful consideration.

Key Points Discussed:
- Understanding Monolithic Applications: As applications expand, components within them can become unwieldy, causing delays and complications in deployment and testing.
- Benefits of Micro-Services: Micro-services allow for the separation of concerns and enable asynchronous communication through methods like Apache Kafka, which facilitates an event-driven architecture.
- Communication Between Services: Gupta illustrates the difference between using direct API calls and a message bus. The latter allows services to operate independently, thus reducing coupling and improving system reliability.
- Using Kafka as a Message Bus: Kafka is highlighted for its scalability and ability to handle high throughputs among distributed systems. Key concepts such as producers, consumers, and topics are explained to showcase how Kafka can effectively manage messages between multiple services.
- Practical Example: The talk includes a use case where a user creation process is coupled with sending welcome messages. A producer in the Rails application is used to publish messages when users are created or updated, using callbacks in models to reduce code duplication.
- Consumer Logic: Gupta discusses using libraries like Karafka to consume messages from Kafka, demonstrating how to configure controllers to handle events cleanly.
- Real-Time Demonstration: Throughout her talk, Gupta performs a live demo where a signup process triggers messages sent via Kafka, illustrating the practical application of the discussed concepts.
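The producer pattern summarized above — publishing on model create/update with shared metadata added in one place — can be sketched in plain Ruby. All names here (`Publishable`, `User`, the `user.created` event) are illustrative assumptions, not code from the talk; in a Rails app the delivery lambda would wrap a Kafka client call such as ruby-kafka's `deliver_message`.

```ruby
require "json"
require "time"

# Hypothetical concern-style module: models that include it call publish,
# and the envelope (event name, timestamp) is built centrally so every
# message follows the same contract.
module Publishable
  # Delivery is pluggable: in production this lambda would wrap a Kafka
  # producer (e.g. ruby-kafka's kafka.deliver_message(payload, topic:));
  # here it defaults to an in-memory log so the sketch runs anywhere.
  class << self
    attr_accessor :delivery
  end
  self.delivery = ->(topic, payload) { (@log ||= []) << [topic, payload] }

  def publish(event, topic:, body:)
    envelope = {
      event: event,
      occurred_at: Time.now.utc.iso8601, # shared metadata added centrally
      body: body
    }
    Publishable.delivery.call(topic, JSON.generate(envelope))
  end
end

# A minimal stand-in for an ActiveRecord model using the concern.
class User
  include Publishable

  attr_reader :name, :phone

  def initialize(name:, phone:)
    @name = name
    @phone = phone
    # In Rails this would run from an after_create callback.
    publish("user.created", topic: "users", body: { name: name, phone: phone })
  end
end
```

Keeping delivery pluggable is what lets the envelope logic stay centralized while each model configures only the topic and body of its messages.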

Conclusions: Transitioning to micro-services can enhance developer productivity and application maintainability. While there are deployment challenges to consider, the developer perspective is important in ensuring a smooth shift to a micro-services architecture. Gupta emphasizes focusing on development while leaving deployment intricacies to DevOps teams.

In summary, adopting a micro-services architecture with tools like Kafka can streamline application development, making it easier for organizations to adapt to growing demands and complexities in software management.
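The consumer side of the flow described above can be sketched similarly. Karafka's routing and consumer APIs have changed across major versions, so this sketch keeps the message handling in plain Ruby classes and shows the Karafka wiring only in comments; `WelcomeSmsSender`, `UserCreatedHandler`, and the payload shape are hypothetical.

```ruby
require "json"

# Hypothetical SMS service object: the talk keeps the controller lean and
# delegates the actual work to a separate service like this.
class WelcomeSmsSender
  def self.call(phone:, name:)
    # A real implementation would call an SMS gateway here.
    "SMS to #{phone}: Hi #{name}, thanks for signing up!"
  end
end

# Plain handler holding the consumer logic. In Karafka (2.x style, an
# assumption -- the talk used an earlier version) it would be invoked
# from a consumer class, roughly:
#
#   class UsersConsumer < Karafka::BaseConsumer
#     def consume
#       messages.each { |m| UserCreatedHandler.call(m.raw_payload) }
#     end
#   end
#
# with the topic-to-consumer mapping declared in the app's routing.
class UserCreatedHandler
  def self.call(raw_payload)
    message = JSON.parse(raw_payload)
    body = message.fetch("body")
    # Pick the user's phone number and name out of the published event,
    # as in the demo, and hand them to the SMS service.
    WelcomeSmsSender.call(phone: body.fetch("phone"), name: body.fetch("name"))
  end
end
```

Separating the handler from the framework wiring keeps the consumer logic testable without a running Kafka broker.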

Speaker: Ankita Gupta, Software Engineer, honestbee

As organisations get bigger, handling large applications gets harder: long release and test cycles, and higher chances of a small change affecting other parts of the system. Micro-services solve some of these problems, albeit with their own set of challenges. Apache Kafka allows setting up event-driven architectures, wherein the concern of each service can be cleanly separated and communication among services can happen asynchronously. The transition from a large Rails application to smaller applications can be made more seamless with a few easy steps. I will be elaborating steps developers can take to make this process easier.
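The fire-and-forget decoupling the abstract describes can be illustrated with an in-memory queue standing in for Kafka — a sketch of the pattern only, not of Kafka itself: the producing service pushes an event and moves on, and an independent consumer reacts to it.

```ruby
# Illustration only: Ruby's thread-safe Queue stands in for the Kafka
# message bus; the decoupling idea is the same.
bus = Queue.new

# Producer side: the user-creation service fires an event and forgets it.
producer = Thread.new do
  bus << { event: "user.created", name: "Ada" }
end

# Consumer side: the notification service picks the event up
# independently and reacts to it.
notifications = []
consumer = Thread.new do
  event = bus.pop
  notifications << "welcome email for #{event[:name]}"
end

[producer, consumer].each(&:join)
puts notifications.first # → welcome email for Ada
```

Neither side calls the other directly, which is the coupling reduction the talk argues for.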

Speaker's Bio

Ankita is working as a full-stack software engineer at honestbee. In her free time, she works on her non-profit project Jugnuu, a low-cost, mobile-based English language solution for children.

Event Page: http://www.reddotrubyconf.com/

Produced by Engineers.SG

Red Dot Ruby Conference 2017

00:00:04.080 good afternoon everyone um so I work at
00:00:08.410 honestbee and my talk flows in very well
00:00:10.809 with what Florian just discussed about
00:00:12.549 monolithic applications and oftentimes
00:00:15.010 what you find is that as your
00:00:17.320 application size grows bigger and the
00:00:19.600 business requirements keep flowing in
00:00:21.250 certain parts of the application you
00:00:24.550 know they just get bulky and it becomes
00:00:26.650 harder to change them for various
00:00:28.960 reasons I mean a few if you don't have a
00:00:30.700 sophisticated system like Shopify does
00:00:32.619 and you know sometimes you have to wait
00:00:33.879 longer for builds to run deployments
00:00:36.070 become slower and for tests either you
00:00:39.280 have to have a powerful test
00:00:40.660 suite or if you don't feel
00:00:42.460 confident about certain integration
00:00:43.960 tests you know surely you have to
00:00:46.300 put in QA efforts and so on but
00:00:49.149 obviously splitting up applications is
00:00:50.980 not free it comes with additional
00:00:53.260 responsibilities and there are certain
00:00:55.410 cons you know or rather certain
00:00:58.329 additional like headaches that it brings
00:01:00.309 along so I don't want to get into
00:01:02.530 whether you should or you shouldn't
00:01:04.509 split it up because I think that's
00:01:06.399 another topic altogether but should you
00:01:08.770 choose to do it what are some of the
00:01:11.380 good ways or what are some of the ways
00:01:13.060 to think about it especially from a
00:01:14.709 development point of view and not
00:01:16.869 thinking about the deployment challenges
00:01:18.429 involved in this process so I'd first like
00:01:21.850 to show you this diagram and let's say
00:01:24.639 you decide to split services into you
00:01:27.189 know different you take a big service
00:01:29.799 and you start taking out some of the you
00:01:31.389 know some of the components of the
00:01:32.619 service out there's various ways for
00:01:35.409 these services to actually talk to each
00:01:37.179 other and one of the common ways is HTTP
00:01:39.249 now one of the issues with services
00:01:42.849 talking via APIs is that every single
00:01:46.270 service is aware of the concern of the
00:01:48.969 other service so let me give an example
00:01:50.700 let's say you have a service that
00:01:52.749 creates users and you also want to do
00:01:55.899 this additional notification you know
00:01:57.880 logic where every time a user is created
00:01:59.979 maybe you want to send a notification
00:02:01.630 for verification of the user or send out
00:02:04.329 welcome emails and so on and
00:02:06.270 notification itself is something that
00:02:08.380 can be that is useful across your system
00:02:10.840 not just for user so maybe you think
00:02:12.460 about hey maybe I can separate out
00:02:14.500 notifications in
00:02:15.960 to a separate service now if you were
00:02:18.090 making your two services contact each
00:02:19.920 other via APIs you know the user
00:02:22.770 creation service now has a contract that
00:02:25.260 it needs to follow and it needs to know
00:02:27.150 that at some point when the users
00:02:28.770 created successfully it has to send out
00:02:31.170 you know an API call to another service
00:02:33.740 on the right you have a slightly
00:02:37.110 different approach to this and you may
00:02:39.390 choose to use one or the other it's not
00:02:41.760 a complete black or white situation but
00:02:43.920 the benefit of the second approach for
00:02:46.200 at least the example that I just spoke
00:02:47.610 about is that your user creation service
00:02:50.880 only cares about creating a user and
00:02:53.250 once it's done with that it pushes that
00:02:55.440 event out onto a message bus and the
00:02:57.810 services that actually care about
00:02:59.190 anything to do with user creation
00:03:01.260 decide to then pick it up pick up that
00:03:03.840 event and do certain things like sending
00:03:05.850 out notifications and so on so in a way
00:03:08.160 you are reducing the coupling between
00:03:09.510 your services and you're also separating
00:03:11.790 the concerns of different services quite
00:03:14.100 clearly when you use such a system so
00:03:17.160 fundamentally it's this philosophy of
00:03:19.920 fire-and-forget where the user
00:03:21.780 creation service says I fire a message and
00:03:24.390 then I don't really care what happens
00:03:25.890 and then another service picks it up and
00:03:27.870 it processes it and once it's done
00:03:29.790 processing it maybe prints an
00:03:31.320 acknowledgement back and you know if a
00:03:33.150 service cares enough about it they pick
00:03:35.100 it up and they continue with it and
00:03:36.480 that's how the flow continues now there
00:03:39.240 is various tools to do it and once again
00:03:40.980 I wouldn't want to get into the debate
00:03:42.480 because everybody's use cases and
00:03:44.790 situations may demand different tools
00:03:46.260 but let's say you choose to do it using
00:03:47.760 Kafka which is actually a highly
00:03:50.850 scalable distributed message bus it provides a
00:03:54.120 certain replication factor so you know
00:03:56.430 it's highly reliable and so on and is being
00:03:58.830 used by a lot of organizations so for
00:04:02.910 those of you who may not have worked
00:04:04.200 with it some of the terms
00:04:05.250 just before we get into a deeper
00:04:08.670 example there is the concept of a producer so
00:04:11.670 you have producers that are constantly
00:04:13.790 you know just publishing messages to the
00:04:16.410 bus and they're doing it in a particular
00:04:18.930 topic which is basically you know what
00:04:21.180 what are if it's a user creation maybe
00:04:23.580 you have a topic called user and so on
00:04:25.800 right and these topics are then divided
00:04:27.480 into partitions and the benefit
00:04:28.920 of splitting things into partitions is that
00:04:30.630 it can be it can exist across multiple
00:04:33.060 machines so it makes it highly
00:04:34.830 you know scalable and then you can also
00:04:36.600 have your consumers subscribe to certain
00:04:40.830 topics and read it simultaneously off a
00:04:43.320 particular partition so you can have
00:04:44.820 multiple consumer processes that are
00:04:46.410 running and that are simultaneously
00:04:47.910 reading it so it also gives you this
00:04:49.860 whole notion of parallel processing of
00:04:52.050 messages so that's what makes Kafka
00:04:54.060 highly superior and a really good choice
00:04:57.480 as a message bus so some of the key terms
00:04:59.970 are producers not going too much into
00:05:01.680 detail of you know all the other stuff
00:05:03.930 that happens in the background but you
00:05:05.790 have your producer consumer and your
00:05:08.010 topic and so let's talk about the simple
00:05:10.320 use case that I spoke about in the
00:05:11.820 beginning which is user creation and
00:05:13.980 maybe sending something as simple as
00:05:15.750 like a welcome message to the users
00:05:17.970 saying hey thanks for signing up so the
00:05:21.120 first thing you want is a producer and
00:05:22.770 the producer is the one what we found
00:05:28.500 out in our Rails application context is
00:05:28.500 that oftentimes the events that you want
00:05:30.900 to send out are linked to database
00:05:33.960 changes so it could be your insertion to
00:05:35.880 your database so it could be an update
00:05:37.320 your database and oftentimes these are
00:05:39.510 the events that you want to push to the
00:05:40.860 message bus there might be others but
00:05:44.820 these actually compose the majority of the
00:05:44.820 events that we wanted at different
00:05:46.920 applications to respond to and so one
00:05:50.130 way to do that is to have code to send
00:05:53.130 out messages spread across your
00:05:54.870 application and that definitely brings
00:05:58.650 around you know a bit of duplication
00:06:00.660 so then you know you bring it into
00:06:02.370 callbacks in the models some of the things
00:06:05.040 you know when you when you allow
00:06:06.810 different developers to just add logic
00:06:09.450 to send messages to Kafka based on
00:06:12.210 certain events is that to an extent you
00:06:14.550 want to have certain contracts followed
00:06:16.260 even when you publish things to Kafka
00:06:17.760 so for example metadata right maybe you
00:06:20.550 want to have a timestamp included in every
00:06:22.650 single one of your messages so where I'm
00:06:25.410 driving at is that it does make sense to
00:06:27.840 centralize the logic of sending the
00:06:29.910 message itself and adding some of the
00:06:31.590 metadata attributes the content and the
00:06:34.140 topic of the message can be configurable
00:06:36.060 but some of the other factors around it
00:06:37.920 remains the same you know it's a similar
00:06:40.380 logic across the codebase
00:06:42.360 so what does it look like in practice so
00:06:45.270 basically this is what we ended up doing
00:06:46.949 we created a concern and let me
00:06:51.419 just show it to you what it looks like
00:06:52.590 so it is basically something that can
00:06:54.900 be included inside your model and you
00:06:57.569 can add a simple publish statement
00:06:59.129 with arguments that are fundamentally
00:07:01.080 arrays because you do have situations
00:07:04.199 where you want to publish multiple times
00:07:08.189 from within the same model so for
00:07:09.810 example when the user is created I want
00:07:12.300 to send a message to a particular topic
00:07:14.099 for user creation and I want to use an
00:07:16.229 existing serializer that decides what
00:07:19.439 the format of the message is and when
00:07:21.270 the user is updated maybe I want to only
00:07:23.610 send a message conditionally if the
00:07:26.009 phone number of the user is updated and
00:07:27.599 and at that point I want to have a
00:07:29.639 different like a message blob which is
00:07:31.439 you know coded inside based on the new
00:07:33.840 attributes like the phone number and so
00:07:36.029 on right so you can have an array of
00:07:37.919 messages that get sent out for each of
00:07:40.139 your for each of your models and that's
00:07:44.460 how you have your first you know a
00:07:45.870 simple producer where every single
00:07:48.599 creation or update can then be published
00:07:51.270 to the message bus now as and when
00:07:53.430 you're publishing messages at some point
00:07:55.349 you do want to pick up those messages
00:07:57.539 and process them and do something with
00:07:59.580 them and that's where one of the
00:08:02.669 libraries that we discovered is Karafka
00:08:04.830 which makes it really really easy to set
00:08:06.779 up a very simple and clean structure
00:08:09.210 across your application where you
00:08:10.620 basically you create a central
00:08:13.620 configuration and you register a
00:08:15.539 controller to a topic so you can create
00:08:18.120 applications which are actually having
00:08:20.009 consumers consuming from multiple topics
00:08:23.189 or multi topic applications in a way and
00:08:25.529 once you define this mapping so you can
00:08:27.750 see that it's actually just quite simple
00:08:30.060 configuration you can define your
00:08:31.680 controller code and I mean it's
00:08:34.409 advisable to keep your controller code
00:08:36.199 lean so yeah so what I've done here in
00:08:39.599 this case is that I've just created a
00:08:40.979 separate service to send out SMS in this
00:08:42.990 case and I'm just picking up the phone
00:08:47.640 number of the user as well as the name
00:08:49.829 of the user from the message that was
00:08:52.380 sent
00:08:52.710 the message bus and that's it and that's
00:08:55.380 where your consumer logic is so I can
00:08:58.890 show you a running example I've been
00:09:01.290 advised not to do demos and short talks
00:09:03.570 because you know whatever has to go
00:09:06.030 wrong will go wrong but I'm just going
00:09:07.830 to do it anyway because I just want to
00:09:09.570 show it that it's actually really easy
00:09:11.280 at least from a development point of
00:09:12.870 view and you know the barrier to do it
00:09:15.210 is really little just because we have a
00:09:17.220 really good ecosystem existing in Ruby to do
00:09:20.520 this so so this is my code that I was
00:09:24.690 just talking about and this is the this
00:09:26.970 is the application that basically has
00:09:29.490 the you know it's the same code that you
00:09:31.410 just saw on the slides so a little bit
00:09:33.270 redundant and then this is where your
00:09:35.450 controller is which is exactly the same
00:09:39.060 logic and when I go into my terminal
00:09:50.690 alright so what I have here is I have my
00:09:54.540 Rails server running on one end and then
00:09:56.220 I also on the right hand side in this
00:09:57.990 particular window I have my Karafka
00:10:00.900 server running so Karafka just a bit
00:10:02.310 of background is that it takes the
00:10:04.050 messages in the message bus and it
00:10:05.790 pushes it into Sidekiq queues and then
00:10:09.210 you have Sidekiq workers that are
00:10:10.740 basically picking those messages up and
00:10:12.450 sending it to your controller and the
00:10:14.430 controller then processes the message so
00:10:15.990 the right hand side is the Karafka server
00:10:18.780 and the left one is the Karafka worker
00:10:20.430 there's a few errors and that's all my
00:10:21.960 local environment configuration it's not
00:10:23.550 entirely normal but I haven't bothered
00:10:25.710 fixing it and then this is my signup
00:10:29.640 form right so this is a simple signup
00:10:31.410 form and what I'm going to do right now
00:10:34.200 is I'm just going to create my just add
00:10:38.040 my phone number in and then add my
00:10:40.440 honestbee email and then just you know a
00:10:42.660 password and here's my phone so I'm
00:10:48.600 going to sign up and ideally I should
00:10:50.940 get a message if all goes well so there
00:10:52.770 should be some logs which basically says
00:10:54.570 that it's done processing
00:10:57.740 and just about any time now there so I
00:11:01.610 got a message now which basically says
00:11:03.800 thanks for signing up so what I'm trying
00:11:08.270 to say is that from a development point
00:11:10.400 of view it's easy
00:11:11.480 there are deployment challenges but for
00:11:13.520 now let's just leave that to the DevOps
00:11:15.200 to take care of and let's just focus on
00:11:17.420 developing it thank you everyone