This repository hosts Scala + Spark code that shows how to populate a new column using each row's values.

francismaria/RowBasedColumnPopulation

Row-based Column Values Population

This repository hosts a simple project that shows how to add a new column to a Spark DataFrame, where the column's value for each row is derived from all the values in that row.

Usage

Input

Start the sbt shell from the project root, then run the application or its tests:

sbt

> run

> test

Output
+---------+-----+------+-----+------------+
|repair_id|brand|engine|tires|valid_record|
+---------+-----+------+-----+------------+
|        1|  BMW|   s4t|  255|         YES|
|        2| Fiat|   3er|  245|         YES|
|        3| Audi|  null| null|          NO|
+---------+-----+------+-----+------------+
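
The table above suggests that valid_record is YES only when every other column in the row holds a non-null value. The README does not show the implementation, so the following is a minimal sketch under that assumed rule, not the repository's actual code. The object name ValidRecordSketch and the helper withValidRecord are hypothetical, and the sample rows are copied from the output above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, when}

// Hypothetical object name; not taken from the repository.
object ValidRecordSketch {

  // Assumed rule (inferred from the sample output): a row is valid
  // only when none of its existing columns is null.
  def withValidRecord(df: DataFrame): DataFrame = {
    // Combine one "is not null" check per column into a single predicate.
    val allPresent = df.columns.map(c => col(c).isNotNull).reduce(_ && _)
    df.withColumn("valid_record", when(allPresent, lit("YES")).otherwise(lit("NO")))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ValidRecordSketch")
      .getOrCreate()
    import spark.implicits._

    // Sample rows mirroring the README output; None becomes a SQL null.
    val repairs = Seq[(Int, String, Option[String], Option[Int])](
      (1, "BMW", Some("s4t"), Some(255)),
      (2, "Fiat", Some("3er"), Some(245)),
      (3, "Audi", None, None)
    ).toDF("repair_id", "brand", "engine", "tires")

    withValidRecord(repairs).show()
    spark.stop()
  }
}
```

Because the predicate is built from df.columns, the check stays in Spark's built-in column expressions and adapts to however many columns the DataFrame has. A row-wise UDF over a struct of all columns would be an alternative if the rule needed logic that column expressions cannot express.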