Python Generators, Generic File Parsing, Data Type Inference, and Frequency Distributions...

dseeni/Data_Parser_via_Python_Generators

Generic CSV Data Parser using Python Generators:

  • FileReader(self, filename, column_to_track, *, date_column=None)

  • Column data types (float, string, integer, date) are inferred automatically from the second row of the .csv file

  • Date parsing is restricted to a single column, named via date_column

  • Returns the frequency distribution of the values in the column whose header is passed as column_to_track

    • Replace whitespace with "_" when passing header names to column_to_track
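The behaviour described in the list above can be sketched roughly as follows. This is an illustration only, not the repository's FileReader: the names infer_type, safe_cast and parse_rows, the date_format parameter, and the sample data are all hypothetical, and the sketch assumes the first row of the file is the header row.

```python
import csv
import io
from datetime import datetime


def infer_type(value):
    """Guess a converter for one raw cell: try int, then float, else str."""
    for cast in (int, float):
        try:
            cast(value)
            return cast
        except ValueError:
            continue
    return str


def safe_cast(cast, cell):
    """Apply cast, falling back to the raw string if the cell does not fit."""
    try:
        return cast(cell)
    except ValueError:
        return cell


def parse_rows(lines, *, date_column=None, date_format="%m/%d/%Y"):
    """Lazily yield typed rows as dicts keyed by underscore-normalized headers.

    Hypothetical helper; date_format is an assumption about the input.
    """
    reader = csv.reader(lines)
    headers = [h.strip().replace(" ", "_") for h in next(reader)]
    first = next(reader)  # second row of the file drives type inference

    casts = []
    for name, cell in zip(headers, first):
        if name == date_column:
            # Only the designated column is parsed as a date.
            casts.append(
                lambda s, fmt=date_format: datetime.strptime(s, fmt).date()
            )
        else:
            casts.append(infer_type(cell))

    def convert(row):
        return {
            name: safe_cast(cast, cell)
            for name, cast, cell in zip(headers, casts, row)
        }

    yield convert(first)
    for row in reader:  # one row at a time; the file is never fully loaded
        yield convert(row)


# Made-up sample data, standing in for a real .csv file.
sample = io.StringIO(
    "Plate ID,Vehicle Make,Fine,Issue Date\n"
    "ABC123,TOYOT,65.5,11/14/2016\n"
    "XYZ789,HONDA,115,11/15/2016\n"
)
rows = list(parse_rows(sample, date_column="Issue_Date"))
```

Because parse_rows is a generator, rows are converted only as they are consumed. Inferring types from a single row is cheap but fragile: a column whose first value looks like an integer is treated as an integer throughout, which is why this sketch falls back to the raw string when a later cell does not fit.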

Processing "nyc_parking_tickets_extract.csv":

The most frequent value in each tracked column, with its citation count:

  • Vehicle Make: ('TOYOT', 112)

  • Vehicle Body Type: ('SUBN', 352)

  • Violation Description: ('PHTO SCHOOL ZN SPEED VIOLATION', 140)

  • Registration State: ('NY', 779)

  • Issue Date: (datetime.date(2016, 11, 14), 10)
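Each result above is a (value, count) pair. One way to produce such a pair from an iterable of parsed rows is with collections.Counter. This is a minimal sketch under assumptions: most_frequent is a hypothetical helper, the rows are assumed to be dicts keyed by underscore-normalized header names, and the sample rows are made up.

```python
from collections import Counter


def most_frequent(rows, column_to_track):
    """Return (value, count) for the most common value in one column.

    rows may be any iterable of dicts, including a generator.
    Raises IndexError if rows is empty.
    """
    counts = Counter(row[column_to_track] for row in rows)
    return counts.most_common(1)[0]


# Made-up rows for illustration only.
tickets = [
    {"Vehicle_Make": "TOYOT"},
    {"Vehicle_Make": "HONDA"},
    {"Vehicle_Make": "TOYOT"},
]
result = most_frequent(tickets, "Vehicle_Make")
print(result)  # ('TOYOT', 2)
```

The Counter consumes the rows in a single pass, so it pairs naturally with a generator that yields one parsed row at a time.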
