Useful Python Code

진지한 개발자

제이_엔 2023. 5. 20. 11:13

Type conversion

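# column_a is a string column: take the characters at positions 13 and 14 (0-based)
df['column_a'].str.slice(start=13, stop=15)
# timestamp is a datetime64 column: a Series needs the .dt accessor for date/time parts
df["date"] = df["timestamp"].dt.date
df["time"] = df["timestamp"].dt.time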

from datetime import datetime
# Parse string timestamps with fractional seconds into datetime objects
df['timestamp'] = df['timestamp'].apply(lambda x: datetime.strptime(x, '%Y-%m-%d %H:%M:%S.%f'))

import pandas as pd
# Split a date column into separate date and time columns
df['Dates'] = pd.to_datetime(df['date']).dt.date
df['Time'] = pd.to_datetime(df['date']).dt.time
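The conversions above can be tried end to end on a small frame. This is a minimal sketch with made-up sample data: the values and the `hour_str` column are assumptions for illustration, and `pd.to_datetime` with an explicit `format` is used here as a vectorised alternative to the `strptime` call.

```python
import pandas as pd

# Hypothetical sample data; the column name mirrors the snippets above.
df = pd.DataFrame({
    "timestamp": ["2023-05-20 11:13:05.123456", "2023-05-21 08:00:00.000000"],
})

# Vectorised parsing; the format string matches the strptime pattern.
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%Y-%m-%d %H:%M:%S.%f")

# The .dt accessor exposes the date and time parts of a datetime64 column.
df["Dates"] = df["timestamp"].dt.date
df["Time"] = df["timestamp"].dt.time

# .str.slice only works on strings, so format the timestamps first;
# positions 11 and 12 hold the hour in 'YYYY-MM-DD HH:MM:SS'.
as_text = df["timestamp"].dt.strftime("%Y-%m-%d %H:%M:%S")
df["hour_str"] = as_text.str.slice(start=11, stop=13)
```

Note that `Dates` and `Time` end up as object columns holding `datetime.date` and `datetime.time` values, so further date arithmetic is usually easier on the original `timestamp` column.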

Reading files

# !pip install awswrangler==2.19.0
import awswrangler as wr
# Read the Parquet data under an S3 path into a pandas DataFrame
df = wr.s3.read_parquet(path='s3://xxxxx/')
df.head()

Preprocessing

# Keep only the most recent row per partition (after sorting, tail(1) takes the last row of each group)
df.sort_values('date').groupby(['date', 'check_date']).tail(1)
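To see how the sort-then-`tail(1)` pattern behaves, here is a small sketch on made-up data. It assumes that `date` is the partition key and `check_date` defines recency, which is a different choice of keys from the one-liner above; the `value` column and all values are hypothetical.

```python
import pandas as pd

# Hypothetical data: two checks per date; we want the latest check for each date.
df = pd.DataFrame({
    "date": ["2023-05-01", "2023-05-01", "2023-05-02", "2023-05-02"],
    "check_date": ["2023-05-03", "2023-05-04", "2023-05-05", "2023-05-03"],
    "value": [10, 11, 20, 21],
})

# Sort by the column that defines recency (ascending by default),
# then keep the last row of each partition.
latest = df.sort_values("check_date").groupby("date").tail(1)
```

The grouping keys decide what counts as a partition and the sort column decides which row in it is the most recent, so it is worth checking both against the actual data layout.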

# List the objects under a given S3 prefix
import boto3, s3fs, pandas as pd
sess = boto3.Session()
my_bucket = sess.resource('s3').Bucket('<my-bucket>')
s3 = s3fs.S3FileSystem()

df = pd.DataFrame()
file_key_list = []
for obj in my_bucket.objects.filter(Prefix='xxxx/wetb/xxxxee'):
    # obj is an ObjectSummary; store its key string
    file_key_list.append(obj.key)
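The listing loop can be wrapped in a small helper. The sketch below is an assumption-laden illustration: `list_keys`, its `suffix` filter, and the stand-in `_FakeBucket` class are all hypothetical names introduced here so the logic can be run without AWS credentials. The helper only relies on the `bucket.objects.filter(Prefix=...)` call and the `.key` attribute that a boto3 bucket resource provides.

```python
from types import SimpleNamespace
from typing import Any, List


def list_keys(bucket: Any, prefix: str, suffix: str = "") -> List[str]:
    """Return the keys under `prefix`, optionally keeping only those ending in `suffix`.

    `bucket` is expected to behave like a boto3 s3.Bucket resource:
    bucket.objects.filter(Prefix=...) yields items with a `.key` attribute.
    """
    return [
        obj.key
        for obj in bucket.objects.filter(Prefix=prefix)
        if obj.key.endswith(suffix)
    ]


class _FakeBucket:
    """Local stand-in for a boto3 bucket (hypothetical, for experimentation only)."""

    def __init__(self, keys: List[str]) -> None:
        self._keys = keys
        self.objects = self  # mimic the bucket.objects collection

    def filter(self, Prefix: str):
        return [SimpleNamespace(key=k) for k in self._keys if k.startswith(Prefix)]


fake_bucket = _FakeBucket(["logs/a.gz", "logs/b.csv", "other/c.gz"])
all_keys = list_keys(fake_bucket, "logs/")
gz_keys = list_keys(fake_bucket, "logs/", suffix=".gz")
```

With a real bucket, the same helper would be called as `list_keys(my_bucket, 'xxxx/wetb/xxxxee')`, where `my_bucket` is the boto3 bucket resource created earlier.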
    
    
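# Read a gzip-compressed CSV file from S3
import s3fs, gzip, pandas as pd

s3 = s3fs.S3FileSystem()
# key: an object key in the bucket, e.g. one of the entries in file_key_list
with s3.open('s3://<my-bucket>/' + key, 'rb') as s3_file:  # do not shadow the s3fs module
    with gzip.open(s3_file, 'r') as f:
        df = pd.read_csv(f)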
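The gzip step can be checked locally without S3. In this sketch an in-memory `io.BytesIO` buffer stands in for the binary file object that `s3.open(..., 'rb')` returns; the CSV content and the `remote_file` name are made up for illustration.

```python
import gzip
import io

import pandas as pd

# Build a small gzip-compressed CSV in memory (hypothetical content).
raw = gzip.compress(b"id,name\n1,a\n2,b\n")
remote_file = io.BytesIO(raw)  # stand-in for the file object opened from S3

# gzip.open accepts an already open binary file object as well as a filename.
with gzip.open(remote_file, "rt", encoding="utf-8") as f:
    df = pd.read_csv(f)
```

As an alternative, `pd.read_csv` can decompress the stream itself when the binary file object is passed with `compression='gzip'`, which removes the need for the inner `gzip.open` block.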
