| | |
- builtins.object
-
- DB
class DB(builtins.object) |
| |
The super simple ORM for Data Scientists!
It performs basic CRUD operations on SQLITE3 database.
Observations:
- Columns must be in this format "ColName" or "col_name" (NO SPACES ALLOWED)
- Commas ',' are NOT accepted
- The more data you have the more slower it will be (it uses pandas for working with sqlite)
- It will set a column 'ID' as primary key when you add a table
- All columns (except ID) will have datatype TEXT...
TODO - Implement and test in chunks (chunk_size) feature for bigger tables
For now it's ok if table is under 10.000 rows, if more it will get slow.. |
| |
Methods defined here:
- __init__(self, database_path)
- Require path to database as parameter
- add_row(self, table_name, one_row_df)
- Insert one row in the database.
table_name - name of the table
one_row_df - a pandas dataframe with just one row
- add_rows(self, table_name, df)
- Add multiple rows to an sqlite table one by one.
table_name - name of the table
df - a pandas dataframe (careful on the size may take a while)
You can just use this instead of add_row def
- append_table(self, table_name, df)
- "
Append df to an existing table in the database
- create_table(self, table_name, df)
- "
Create a sqlite table with and ID INTEGER PRIMARY KEY
If a table with that name exists replace that table with the current one
- execute_query(self, query, keep_connection=False)
- Execute query, commit and close query
- get_connection(self)
- Connect to a db and if it not exists creates one with the name given
- get_row(self, table_name, row_index=None, column_name=None, row_value=None, as_dict=False, chunk_size=None)
- get_row from 'table_name' where 'column_name' has this 'row_value'
or get row where ID = row_index
- get_rows(self, table_name, row_indexes=None, column_name=None, row_values=None, as_dict=False, chunk_size=None)
- get_row from 'table_name' where 'column_name' has this 'row_values' (list)
or get row where index = row_indexes (list)
You can just use this instead of get_row def
- get_table(self, table_name, as_dict=False, chunk_size=None)
- Get table from the database as a pandas dataframe
table_name - name of the table
as_dict=False - option to get the df as dict('list') pandas format ex: {"col1": ['a','b','c'], "col2": ['d','f','g']}
(NOT TESTED)chunk_size - is df is to big you can get it in chunks, where chunk_size = number of rows, it will return also the connection
- get_tables(self, table_names, as_dict=False, chunk_size=None)
- Get tables from database in a dict
table_names - names of the table - LIST
as_dict=False - option to get the df as dict('list') pandas format ex: {"col1": ['a','b','c'], "col2": ['d','f','g']}
(NOT TESTED)chunk_size - is df is to big you can get it in chunks, where chunk_size = number of rows, it will return also the connection
- remove_row(self, table_name, one_row_df=None, id_value=None)
- Delete row or rows by maching
- remove_rows(self, table_name, df=None, id_values=None)
- Remove rows from table based on the dataframe or list of ID's given
You can just use this instead of remove_row def
- remove_table(self, table_name)
- Delete/remove table from database
- update_cells(self, table_name, df)
- Update cells of the table in the database
Data descriptors defined here:
- __dict__
- dictionary for instance variables (if defined)
- __weakref__
- list of weak references to the object (if defined)
| |