PySpark Preprocessing

진지한 개발자

제이_엔 2023. 8. 25. 11:29


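from pyspark.sql.functions import col, explode

# Convert each value in the pandas column col_2 to a list of integers
df['col_2'] = df['col_2'].map(lambda x: [int(e) for e in x])

# Build a Spark DataFrame from the pandas DataFrame (assumes an active SparkSession named spark)
df_spark = spark.createDataFrame(df)
# Explode the list column so that each element becomes its own row
df_spark.select('col_1', explode(col('col_2')).alias('cols_2')).show(10)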
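
To make the effect of the two steps above concrete, here is a minimal plain-Python sketch that mirrors them without needing Spark or pandas. The sample rows and the helper names `to_int_lists` and `explode_rows` are made up for illustration and are not part of the original snippet; the sketch assumes col_2 holds lists of numeric strings.

```python
# Hypothetical sample rows; col_2 is assumed to hold lists of numeric strings.
rows = [
    {"col_1": "a", "col_2": ["1", "2", "3"]},
    {"col_1": "b", "col_2": ["4"]},
    {"col_1": "c", "col_2": []},
]


def to_int_lists(rows):
    """Mirror the pandas map step: cast every element of col_2 to int."""
    return [
        {"col_1": r["col_1"], "col_2": [int(e) for e in r["col_2"]]}
        for r in rows
    ]


def explode_rows(rows):
    """Mirror select('col_1', explode('col_2')): one output row per list element.

    A row whose list is empty yields no output rows, which is also how
    Spark's explode treats empty arrays.
    """
    return [(r["col_1"], e) for r in rows for e in r["col_2"]]


exploded = explode_rows(to_int_lists(rows))
print(exploded)  # [('a', 1), ('a', 2), ('a', 3), ('b', 4)]
```

Note that row "c" disappears from the result. If such rows need to be kept in Spark, explode_outer from pyspark.sql.functions returns them with a null value instead.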
