〈Python Custom CSV Reader Function To Prevent Codec Error〉
When you read a CSV file in Python, pd.read_csv() is the most common method. However, decoding sometimes fails with an error like the one below, and it may still fail even if you tune arguments such as encoding='utf-8' or engine='python'.
'utf-8' codec can't decode byte 0xaa in position 103: invalid start byte
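To see where this message comes from, here is a minimal sketch that reproduces it without pandas. The sample bytes are made up for illustration: they assume a file that was saved as Latin-1, where the byte 0xaa is the character ª. In UTF-8 that byte cannot start a character, so decoding stops there.

```python
# Hypothetical sample data: a tiny CSV saved as Latin-1, not UTF-8.
raw = "name,mark\nAna,ª\n".encode("latin-1")

# Decoding the same bytes as UTF-8 fails on the 0xaa byte.
try:
    raw.decode("utf-8")
    message = ""
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)

# Decoding with the encoding the file was actually written in works.
text = raw.decode("latin-1")
print(text)
```

The error means the file is not valid UTF-8, so the real fix is usually to find the encoding the file was written in.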
Copy, Paste, And Work
To make life easier, I wrote a function that uses the pandas and csv modules.
I recommend collecting custom functions like this one in your own codebase, so you can reuse them instead of rewriting them for every project. You can refer to the other post 〈Import Custom Package/Library/Module In Python〉.
import csv

import pandas as pd


def csv_reader(file, delimiter=',', quotechar='"', header=True):
    # Try pd.read_csv first; fall back to the csv module if it fails.
    try:
        return pd.read_csv(file)
    except Exception:
        print('pd.read_csv error.')

    # Read every row as a list of strings. No encoding is passed here,
    # so open() uses Python's default text encoding.
    row_list = []
    with open(file, newline='') as csvfile:
        spamreader = csv.reader(csvfile, delimiter=delimiter,
                                quotechar=quotechar)
        for row in spamreader:
            row_list.append(row)

    # Convert to a DataFrame; the first row becomes the column names
    # when header is True.
    if header:
        cols = row_list[0]
        df_row = row_list[1:]
        results = pd.DataFrame(columns=cols, data=df_row)
    else:
        results = pd.DataFrame(data=row_list)
    return results
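One caveat about the fallback branch: open() is called without an encoding, so it relies on Python's default text encoding and can hit the same codec error on some systems. The sketch below is one possible variation, not part of the original function. The name read_csv_rows and the list of candidate encodings are assumptions for illustration; it tries each encoding in turn and returns the rows as lists of strings.

```python
import csv
import os
import tempfile


def read_csv_rows(file, encodings=("utf-8", "cp1252", "latin-1"),
                  delimiter=",", quotechar='"'):
    """Return all rows as lists of strings, trying each encoding in turn."""
    for encoding in encodings:
        try:
            with open(file, newline="", encoding=encoding) as csvfile:
                reader = csv.reader(csvfile, delimiter=delimiter,
                                    quotechar=quotechar)
                return list(reader)
        except UnicodeDecodeError:
            # This encoding cannot decode the file; try the next one.
            continue
    raise ValueError(f"could not decode {file} with any of {encodings}")


# Usage with a made-up file that contains the byte 0xaa.
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(b"name,mark\nAna,\xaa\n")
    path = tmp.name

rows = read_csv_rows(path)
os.remove(path)
print(rows)
```

The rows can then be passed to pd.DataFrame the same way as in csv_reader. Note that latin-1 accepts every byte, so as the last candidate it never raises, but it may produce the wrong characters if the file was written in some other encoding.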
Related Posts
〈Learn Python And R On DataCamp. Start Your Data Science Career.〉
〈Import Custom Package/Library/Module In Python〉