How to query Kafka Streaming Data?

What if there was way to give analysts an SQL layer over Kafka Streams when Streaming Data.

In one of our projects, we face a situation where the Analyst team needed to work with Streaming Data but they didnt have programming skills. They were however, comfortable with SQL queries. If there was way to give these analysts an SQL layer over Kafka Streams.

KSQL is streaming SQL engine for Kfk, that provides an intertive SQL interfe llwing you to write wer strem ressing queries withut the need fr writing de. KSQL is eseilly det t frud detetin nd rel-time litins.

KSQL rvides slble, distributed strem ressing inluding ggregtins, jins, windwing, nd mre. dditinlly, unlike SQL, whih runs ginst dtbse r bth ressing system, the results f KSQL query re ntinuus. Befre we dive int writing streming queries, lets tke minute t review sme fundmentl nets f KSQL.


KSQL Strems nd Tbles

n event strem is n unbunded strem f individul indeendent events, while the udte r rerd strem is strem f udtes t revius rerds with the sme key.

KSQL hs similr net f querying frm Strem r Tble. Where the Strem is n infinite series f events r fts, but re immutble, but with query n Tble the fts re udtble r n even be deleted.

lthugh sme f the terminlgies might be different, the nets re retty muh the sme, nd if yure mfrtble with Kfk Strems, yull feel right t hme with KSQL.


KSQL rhiteture

KSQL uses Kfk Strems under the vers t build nd feth the results f the query. KSQL is mde u f tw mnents, the KSQL LI nd the KSQL server. Users f stndrd SQL tls suh s MySql, rle, r even Hive will feel right t hme with LI when writing queries in KSQL. Best f ll KSQL is en-sure (he 2.0 liensed).

The LI is ls the lient nneting t the KSQL Server. The KSQL server is resnsible fr ressing the queries nd retrieving dt frm Kfk, s well s writing results int Kfk.

KSQL runs in tw mdes, stndlne, whih is useful fr rttying, nd develment r distributed mde, whih is hw yud use KSQL when wrking in mre relisti sized dt envirnment.

s exiting s KSQL is nd wht it rmises t deliver fr SQL ver streming dt, t the time f this writing, KSQL is nsidered develer review nd its nt suggested t run ginst rdutin lusters.


Listing 1. Strting KSQL in ll mde

./bin/ksql-cli local

fter running the mmnd bve yu shuld see smething like this in yur nsle:

KSQL in ll mde.png


reting KSQL Strem

Getting bk t yur wrk t BSE, yuve been rhed by ne f the nlysts wh is interested in ne f the litins yuve written befre nd wuld like t mke sme tweks t the litin. But nw, insted f this request resulting in mre wrk, yu sin u KSQL nsle nd turn the nlyst lse t renstrut yur litin s n SQL sttement!

The exmle yure ging t nvert is the lst windwed strem frm the intertive queries exmle fund in

sr/min/jv/bbejek/hter_9/StkerfrmneIntertiveQuerylitin.jv frm lines 96103.

In tht litin, yure trking the number shres sld every ten sends by mny tiker symbl.

Yu lredy hve the ti defined (the ti ms t dtbse tble) nd mdel bjet StkTrnstin where the fields n the bjet m t lumns in tble. Even thugh the ti is defined, we need t register this infrmtin with KSQL by using RETE STREM sttement:

Listing 2. reting Strem fund

reting  Strem fund.png

  1. The CREATE STREAM statement named stock_txn_stream
  2. Registering the fields of the StockTransaction object as columns
  3. Specifying the data format and the Kafka topic serving as the source of the stream (both required parameters)

With this ne sttement yure reting KSQL Strem instne tht yu n nw issue queries ginst. In the WITH luse yull ntie tw required rmeters VLUE_FRMT telling KSQL the frmt f the dt nd the KFK_TI rmeter, telling KSQL where t ull the dt frm.

There tw dditinl rmeters yu n use in the WITH luse when reting strem. nes TIMESTM whih ssites the messge timestm with lumn in the KSQL Strem. ertins requiring timestm, suh s windwing, use this lumn t ress the rerd.

The ther is KEY whih ssites the key f the messge with lumn n the defined strem. In ur se the messge key fr the stk-trnstins ti mthes the symbl field in the JSN vlue, nd we didnt need t seify the key.

But hd this nt been the se then yud hve needed t m the key t nmed lumn beuse yull lwys need key t erfrm gruing ertins, whih yull see when we exeute the strem SQL in n uming setin.
With KSQL the mmnd list tis; yull see list f tis n the brker the KSQL LIs inting t nd whether the tis re registered r nt.

fter yuve reted yur new strem yu n view ll strems nd verify KSQL reted the new strem s exeted with the fllwing mmnds:

Listing 3 Listing ll Strems nd desribing the strem yu just reted

show streams;
describestock_txn_stream;

The results f issuing these mmnds gives yu results s demnstrted in figure 4:

Listing ll Strems.png


Yull ntie tw extr lumns RWTIME nd RWKEY tht KSQL hs inserted. The RWTIME lumn is the timestm led n the messge (either frm the rduer r by the brker), nd the RWKEY is the key (if ny) f the messge. Nw tht yuve reted the strem, lets run ur query n this strem.

Original post can be found here.

Interested in upgrading your skills? Check out our trainings.
 
Software Development Engineer

Share the knowledge

Still have questions?
Connect with us
Thank you.
Your request has been received.
Thank you!
The form has been submitted successfully.