Thursday, May 11, 2023

Spark- Window Function

 

Window functions in Spark

================================================





-> Spark window functions operate on a group of rows (a window, such as a partition) and return a single value for every input row. Spark SQL supports three kinds of window functions:

a) Ranking functions

b) Analytic functions

c) Aggregate functions
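
As a rough illustration of "a single value for every input row", the plain-Scala sketch below (not Spark code) contrasts a grouped aggregate with a window-style computation. The Emp case class and the variable names are assumptions made up for this example; the four rows are borrowed from the sample data used later in this post.

```scala
// Plain-Scala sketch of the idea (not Spark code): a grouped aggregate collapses rows,
// while a window-style computation keeps every row and attaches a per-partition value.
case class Emp(name: String, department: String, salary: Int)

val emps = Seq(
  Emp("Mohan", "Admin", 4000),
  Emp("Maryam", "Admin", 8000),
  Emp("Rajkumar", "HR", 5000),
  Emp("Rohit", "HR", 3000)
)

// Grouped aggregate: one result per department (2 results for 4 rows).
val maxByDept: Map[String, Int] =
  emps.groupBy(_.department).map { case (dept, rows) => dept -> rows.map(_.salary).max }

// Window-style: one result per input row, computed over that row's partition (4 results).
val withDeptMax: Seq[(Emp, Int)] = emps.map(e => (e, maxByDept(e.department)))

withDeptMax.foreach { case (e, deptMax) => println(s"${e.name}, ${e.department}, ${e.salary}, $deptMax") }
```

The second result keeps all four employees and repeats the department maximum on each row, which is what an aggregate function does when it is applied over a window.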



Ranking Functions:
=============

-> ROW_NUMBER(): Assigns a unique sequential number, starting at 1, to each row within a window partition.

-> RANK(): Assigns a rank to each row within a window partition. Tied rows share the same rank, and the ranks that follow are skipped, so gaps appear.

-> DENSE_RANK(): Assigns a rank to each row within a window partition without any gaps. It behaves like RANK(), except that the rank after a tie is the next consecutive number.

-> NTILE(): Distributes the ordered rows of a window partition into the specified number (N) of groups, as evenly as possible. Each row receives the number of the group it falls into, from 1 to N. The desired number of groups must be passed as an argument.

✔️Without use of partition by :

NTILE(2) divides the whole result set into two groups.

✔️With the use of Partition by:

With NTILE(2), the rows of each department partition are divided into two groups.
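
The splitting rule can be sketched in plain Scala (this is not Spark's implementation). The helper name ntileBuckets is hypothetical, and the sketch assumes that when the rows do not divide evenly, the earlier groups receive the extra rows.

```scala
// Plain-Scala sketch of NTILE semantics (not Spark's implementation).
// Returns the 1-based group number for each of `rowCount` ordered rows split into `buckets` groups.
// Assumption: when rows do not divide evenly, the earlier groups get one extra row each.
def ntileBuckets(rowCount: Int, buckets: Int): Seq[Int] = {
  val base = rowCount / buckets   // minimum rows per group
  val extra = rowCount % buckets  // number of groups that get one more row
  (1 to buckets).flatMap { b =>
    val size = if (b <= extra) base + 1 else base
    Seq.fill(size)(b)
  }
}

// Without partitioning: all 12 sample employees form one window, split 6 and 6.
val wholeTable = ntileBuckets(12, 2)

// With partitioning by department: the 5 IT employees form their own window, split 3 and 2.
val itDepartment = ntileBuckets(5, 2)

println(wholeTable)
println(itDepartment)
```

The same rule is applied independently to every partition, so a department with a single employee would only ever see group 1.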

✔️ Code Snippet:

import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{dense_rank, ntile, rank, row_number}
import spark.implicits._  // `spark` is the SparkSession, as provided by spark-shell

// Sample employee data: (id, name, department, salary)
val df = Seq((101, "Mohan", "Admin", 4000),
  (102, "Rajkumar", "HR", 5000),
  (103, "Akbar", "IT", 9990),
  (104, "Dorvin", "Finance", 7000),
  (105, "Rohit", "HR", 3000),
  (106, "Rajesh", "Finance", 9800),
  (107, "Preet", "HR", 7000),
  (108, "Maryam", "Admin", 8000),
  (109, "Sanjay", "IT", 7000),
  (110, "Vasudha", "IT", 7000),
  (111, "Melinda", "IT", 8000),
  (112, "Komal", "IT", 10000))

val df2 = df.toDF("id", "Name", "Department", "Salary")

// Rows are grouped by department and ordered by salary within each group
val window = Window.partitionBy("Department").orderBy("Salary")

// Add one column per ranking function, each evaluated over the same window
df2.withColumn("row_number", row_number().over(window))
  .withColumn("rank", rank().over(window))
  .withColumn("dense_rank", dense_rank().over(window))
  .withColumn("ntile", ntile(2).over(window))
  .show()
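
To make the tie behavior concrete, here is a plain-Scala sketch (not Spark's implementation) of the three numbering rules applied to the IT salaries from the sample data, where two employees earn 7000. The variable names are illustrative only.

```scala
// Plain-Scala sketch of the three numbering rules (not Spark's implementation).
// IT salaries from the sample data, sorted ascending as orderBy("Salary") would sort them.
val itSalaries = Seq(7000, 7000, 8000, 9990, 10000)

// row_number: position in the ordered window; tied rows still get distinct numbers.
val rowNumbers = itSalaries.indices.map(_ + 1)

// rank: 1 + number of rows with a strictly smaller salary, so ties share a rank and leave a gap.
val ranks = itSalaries.map(s => itSalaries.count(_ < s) + 1)

// dense_rank: 1 + number of distinct smaller salaries, so ties share a rank with no gap.
val denseRanks = itSalaries.map(s => itSalaries.distinct.count(_ < s) + 1)

println(rowNumbers) // 1, 2, 3, 4, 5
println(ranks)      // 1, 1, 3, 4, 5
println(denseRanks) // 1, 1, 2, 3, 4
```

The two 7000 salaries share rank 1 under both rules; RANK() then jumps to 3, while DENSE_RANK() continues with 2.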


Want to become a Data Architect?

 

Are you looking for a resource for learning to become a Data Architect?

Data Architect is one of the most senior roles in the data world. To become a Data Architect, one should be an expert in many data-related areas, such as database and DWH management, ETL, the design and development of big data applications, and visualization.

Check the guide and resources for becoming a Data Architect here:

Data Warehousing:
A data warehouse allows stakeholders to make well-informed business decisions by supporting the process of drawing meaningful conclusions through data analytics.

Learn here for FREE: https://www.youtube.com/watch?v=CHYPF7jxlik

Database Management:
Having a solid foundation in database management will help data engineers build, design, and maintain the overall data infrastructure that supports the business requirements.

Learn here for FREE: https://lnkd.in/dqVJr2yv

ETL:
Data engineers have to work in teams to extract data from various sources, transform it into a reliable form, and load it into systems that other teams of data science professionals can use to build other relevant applications.

Learn here for FREE: https://lnkd.in/geaxTyKr

Big Data Tools and Technologies:
With vast amounts of data generated every second, companies now face the problem of efficiently handling and storing petabytes of data. The top tools for handling such big data through distributed processing are Apache Hadoop and Apache Spark.

FREE data engineering roadmap here: https://lnkd.in/g_2BVCFp

Cloud Data Engineering Skills:
Learn Azure here: https://lnkd.in/g44QFhWZ

Learn AWS here: https://lnkd.in/gYbF9_4H

Learn GCP here: https://lnkd.in/dc98RFBN

Data Modeling and Schema Design:
https://lnkd.in/duTGjk6H

Data Visualization:
The data engineering role often involves using data visualization tools to better understand data and its features.
Learn Power BI: https://lnkd.in/gV2Kn9M8

Real-Time Data Processing Frameworks:
Knowledge of real-time data processing frameworks is crucial for data engineers, as they are responsible for handling streaming data.

Learn Kafka Here for FREE: https://lnkd.in/gqxpf43i

Programming Language:
Learn Python/Scala here: https://lnkd.in/ggMZRfpf
https://lnkd.in/gSzSiW6h

Learn SQL here: https://lnkd.in/gAGY-vX3

Portfolio projects:
Build a strong #dataengineering portfolio here:
https://lnkd.in/gwzyHuu9

This is an excellent roadmap to succeed in your data journey.

These are the best resources and are definitely recommended.
