Sqlite Changes #472

Merged
merged 23 commits into from
Aug 22, 2024

Conversation

Collaborator

@shreyas-damle shreyas-damle commented Aug 13, 2024

Items covered:

  • DB layer to configure storage as file or DB; for DB, the default is SQLite for now.
  • APIs done so far:
    • /discover
    • /prompt
  • entity/topic classifier changes to include confidence score.

Pending items:

  • Unit tests (UTs)

dristysrivastava and others added 5 commits August 7, 2024 15:42
* Adding confidence score to entities and topic responses

* Adding labels and fixing UTS

* Adding utils

* Fixing UT

* Remove unused imports

* Updating topic classifier

---------

Co-authored-by: dristy.cd <[email protected]>
* SQLite basic code structure

* Changes after testing

* Ruff fixes
* Adding confidence score to entities and topic responses

* Adding labels and fixing UTS

* Adding utils

* Fixing UT

* Remove unused imports

* Updating topic classifier

---------

Co-authored-by: dristy.cd <[email protected]>
* Added DB implementation for /prompt API

* Updating prompt service

---------

Co-authored-by: dristy.cd <[email protected]>
* Discovery api fixes, Update and datetime converted to isoformat

* Storage type code optimization

* Optimized the way handler method was called based on storage type.

* Changes for prompt api

* Changes after testing
* Adding confidence score to entities and topic responses (#460)

* Adding confidence score to entities and topic responses

* Adding labels and fixing UTS

* Adding utils

* Fixing UT

* Remove unused imports

* Updating topic classifier

---------

Co-authored-by: dristy.cd <[email protected]>

* Added changes for prompt group

* resolved linting issue

* added changes for confidence score for entity classification

* added changes for confidence score

* review comment changes

* review comment changes

* review comment changes

---------

Co-authored-by: Dristy Srivastava <[email protected]>
Co-authored-by: dristy.cd <[email protected]>
Collaborator

@srics srics left a comment

Initial comments

@@ -30,6 +30,7 @@ class CacheDir(Enum):
f"http://{config_details.get('daemon', {}).get('host')}:"
f"{config_details.get('daemon', {}).get('port')}"
)
SQLITE_ENGINE = "sqlite:///{}/pebblo.db"
Collaborator

Make the path to this .db file configurable, with a sensible default location.
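One way to read this suggestion, as a minimal sketch rather than the change that was actually merged: build the engine URL from a config value and fall back to a default directory. The config keys (`storage`, `location`, `name`), the default directory, and the helper `build_sqlite_url` are all assumptions made for illustration.

```python
import os

# Hypothetical defaults; the real default location is not stated in this PR.
DEFAULT_DB_DIR = "~/.pebblo"
DEFAULT_DB_NAME = "pebblo.db"


def build_sqlite_url(config: dict) -> str:
    """Return a SQLite engine URL, using defaults for any missing setting."""
    storage = config.get("storage") or {}
    # Expand "~" so a user-relative default works on any account.
    db_dir = os.path.expanduser(storage.get("location") or DEFAULT_DB_DIR)
    db_name = storage.get("name") or DEFAULT_DB_NAME
    return "sqlite:///{}".format(os.path.join(db_dir, db_name))
```

With an empty config the URL points at the default directory; setting `location` overrides only the directory and keeps the default file name.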

pebblo/app/config/config.py (outdated, resolved)
pebblo/app/config/config.py (resolved)
pebblo/app/config/config_validation.py (resolved)
gr8nishan and others added 10 commits August 14, 2024 20:34
* Adding confidence score to entities and topic responses (#460)

* Adding confidence score to entities and topic responses

* Adding labels and fixing UTS

* Adding utils

* Fixing UT

* Remove unused imports

* Updating topic classifier

---------

Co-authored-by: dristy.cd <[email protected]>

* Added changes for prompt group

* resolved linting issue

* conflict check fixes

* added changes for adding ip address in entity classifier

* added test case for ip address

---------

Co-authored-by: Dristy Srivastava <[email protected]>
Co-authored-by: dristy.cd <[email protected]>
* Storage config code optimized

* Loader doc changes
* Changes for confidence score in /loader/doc as well as UI API

* Fixing UTs

---------

Co-authored-by: dristy.cd <[email protected]>
* Changes regarding Local UI APIs

* Adding AiUser Table

* Adding Dashboard API for retrievals

* Incorporated code review comments

* Adding new file for local ui APIs for DB

* Updated retriever app API

* Added username in AiRetrievals

---------

Co-authored-by: dristy.cd <[email protected]>
* Loader doc changes

* Renamed sqlite db file name.

* Changed docIds to snippetIds

* Add feature to show execution time for functions with database operations

* LoadId's added in all sql tables
#482)

* Adding confidence score to entities and topic responses (#460)

* Adding confidence score to entities and topic responses

* Adding labels and fixing UTS

* Adding utils

* Fixing UT

* Remove unused imports

* Updating topic classifier

---------

Co-authored-by: dristy.cd <[email protected]>

* Added changes for prompt group

* resolved linting issue

* conflict check fixes

* added changes for adding ip address in entity classifier

* added test case for ip address

* added fix for entity classification happening in prompt api for prompt

---------

Co-authored-by: Dristy Srivastava <[email protected]>
Co-authored-by: dristy.cd <[email protected]>
* Fixed issues found while testing.

* Ruff fixes.
* Adding retriever API for local UI for DB Apps

* Adding delete API for Retriever DB apps
* Added local-ui utils file
@shreyas-damle shreyas-damle changed the base branch from pebblo-0.1.18 to main August 21, 2024 17:40
@shreyas-damle shreyas-damle changed the base branch from main to pebblo-0.1.18 August 21, 2024 17:40
@shreyas-damle shreyas-damle changed the base branch from pebblo-0.1.18 to main August 21, 2024 17:40
* Local UI - LoaderApp details page UI.

* Fixed commit issue by adding flag_modified() method.

* Fixed regression on local UI with file storage.
Collaborator

Can these table names come from an Enum?
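A minimal sketch of what that could look like, assuming the table names are plain strings today. The enum `TableName`, its members and values, and the helper `describe_insert` are hypothetical and not taken from the PR.

```python
from enum import Enum


class TableName(str, Enum):
    # Hypothetical table names, for illustration only.
    AI_APP = "ai_app"
    AI_USER = "ai_user"
    AI_DOCUMENT = "ai_document"


def describe_insert(table: TableName, entry_id: str) -> str:
    """Build a log line from the enum instead of a raw table-name string."""
    return f"Entry {entry_id} inserted into {table.value}"
```

Keeping the names in one enum means a typo in a table name fails at attribute lookup instead of surfacing later as a missing table.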

AiUser, ai_user_obj.dict()
)
if insert_status:
logger.debug(f"Entry: {entry} in AiUser completed")
Collaborator

@gr8nishan gr8nishan Aug 22, 2024

Do we need to log "entry"?

Collaborator Author

@dristysrivastava Can you please check this?

return datetime.now().isoformat()


def timeit(func):
Collaborator

Can the timeit function check a condition up front? Right now we compute the elapsed time on every call even when it is not logged. Could it do the timing only when debug mode is on, and otherwise return from the first line?
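A sketch of the guarded decorator this comment describes, assuming the timing is reported through a standard `logging` logger at DEBUG level. The logger name is hypothetical, and this is not the PR's actual `timeit` body, which is not shown here.

```python
import functools
import logging
import time

logger = logging.getLogger("pebblo.timeit.sketch")  # hypothetical logger name


def timeit(func):
    """Log the wall-clock time of func, but only when DEBUG logging is on."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Skip the timing work entirely unless the result would be logged.
        if not logger.isEnabledFor(logging.DEBUG):
            return func(*args, **kwargs)
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            logger.debug("%s took %.6f s", func.__name__, elapsed)

    return wrapper
```

The check runs on each call rather than at decoration time, so changing the log level while the server is running takes effect immediately.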

return doc_info
except Exception as e:
logger.error(
f"Get Classifier Response Failed for doc: {doc}, Exception: {e}"
Collaborator

Do we need to log doc? Can we log some ID instead?

else:
# If entity or topic does not exist in app coll
restricted_data[data] = {
"count": doc_restricted_data.get(data, 0),
Collaborator

@gr8nishan gr8nishan Aug 22, 2024

I think we can directly assign 0 to "count"; there is no need for the get operation.

Aggregating all input, processing the data, and generating the final report
"""
logger.debug("Generating final report")
logger.debug(f"LoaderApp: {app_data}")
Collaborator

Do we need to log raw data and app data?

:return: sorted retrievals list
"""
sorted_data = sorted(
retrievals, key=lambda item: len(item["retrievals"]), reverse=True
Collaborator

@gr8nishan gr8nishan Aug 22, 2024

Should we use get() inside len(item["retrievals"])?

Collaborator Author

Yes, we are using len() and then applying sorted() on top of it.

Collaborator Author

Updated.
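For reference, one possible shape of the get()-based sort discussed in this thread. This is a sketch only: the thread does not show the updated code, and the function name `sort_by_retrieval_count` is hypothetical.

```python
def sort_by_retrieval_count(retrievals: list[dict]) -> list[dict]:
    """Sort items by number of retrievals, largest first.

    An item with no "retrievals" key (or a None value) counts as zero
    instead of raising KeyError or TypeError.
    """
    return sorted(
        retrievals,
        key=lambda item: len(item.get("retrievals") or []),
        reverse=True,
    )
```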

* Updating /prompt/ as well as /delete API

* Removed multiple commit statements

* Moved response models in db response file

* Updating models

* Updating model fields

---------

Co-authored-by: dristy.cd <[email protected]>
def _get_app_type_and_class(self):
    AppClass = None
    app_type = None
    load_id = self.data.get("load_id") or None
Collaborator

We can pass None as the default to get() itself.
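A note on the two spellings, shown as a small sketch with hypothetical helper names: `dict.get` already returns None for a missing key, while the `or None` form also turns falsy values such as an empty string into None. Whether that extra normalization is wanted for `load_id` is not stated in the PR.

```python
def load_id_plain(data: dict):
    """Missing key gives None; an empty string is returned unchanged."""
    return data.get("load_id", None)


def load_id_normalized(data: dict):
    """Missing key gives None, and so does any falsy value such as ""."""
    return data.get("load_id") or None
```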

@timeit
def process_request(self, data):
    try:
        self.db = SQLiteClient()
Collaborator

@gr8nishan gr8nishan Aug 22, 2024

We are creating a SQLiteClient in every request. Is it possible to create the DB client once, when the Pebblo server starts, and reuse it across requests? Maybe a singleton class or some other option?
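One option, sketched under assumptions: a cached factory that builds the client on first use and returns the same instance afterwards. `SQLiteClientSketch` is a hypothetical stand-in built on the standard `sqlite3` module and says nothing about how the PR's `SQLiteClient` is implemented.

```python
import functools
import sqlite3


class SQLiteClientSketch:
    """Stand-in for the PR's SQLiteClient; this class is hypothetical."""

    def __init__(self, path: str = ":memory:"):
        # check_same_thread=False lets one connection be shared across
        # threads; callers must then serialize access themselves.
        self.conn = sqlite3.connect(path, check_same_thread=False)


@functools.lru_cache(maxsize=None)
def get_db_client() -> SQLiteClientSketch:
    """Create the client on first call and reuse it on every later call."""
    return SQLiteClientSketch()
```

A request handler would then call `get_db_client()` instead of constructing a client itself. A shared connection still needs a decision about per-request transactions, which this sketch leaves open.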

dristysrivastava and others added 2 commits August 22, 2024 15:18
* Generate PDF at the end of loader/doc API.

* Delete loader app
@shreyas-damle shreyas-damle marked this pull request as ready for review August 22, 2024 10:01

exist, existing_ai_app = db.query(app_class, ai_app)
if exist and existing_ai_app:
logger.debug(f"Application details exists in {app_class.__tablename__}")
Collaborator

We should add a specific prefix to the logger message, such as "LoaderApp:", as we did in the last release.

else:
raise Exception(result)
except Exception as err:
message = f"PDF report is not generated. Error: {err}"
Collaborator

@gr8nishan gr8nishan Aug 22, 2024

If this message is going to the UI, should we remove the exception text?
Should this be done across all APIs?
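A sketch of one way to keep exception details out of a UI-facing message while still recording them in the server log. The helper `safe_error_response`, the `error_id` field, and the logger name are assumptions for illustration, not part of the PR.

```python
import logging
import uuid

logger = logging.getLogger("pebblo.report.sketch")  # hypothetical logger name


def safe_error_response(err: Exception, public_message: str) -> dict:
    """Log the full exception server-side and return a sanitized payload."""
    # The random ID lets a UI error be matched to its server log entry.
    error_id = uuid.uuid4().hex
    logger.error("%s [error_id=%s] %r", public_message, error_id, err)
    return {"message": public_message, "error_id": error_id}
```

The caller passes a fixed message such as "PDF report is not generated." and the exception text stays in the log only.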

else:
# Commit will only happen when everything went well.
message = "App Discover Request Processed Successfully"
logger.debug(message)
Collaborator

We should add an identifier to the log message.
This should be done for all log messages.
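One way to apply an identifier to every message without editing each call site is the standard library's `logging.LoggerAdapter`. The sketch below uses a "LoaderApp" prefix as the identifier; the class `PrefixAdapter`, the helper `get_prefixed_logger`, and the logger name are hypothetical.

```python
import logging


class PrefixAdapter(logging.LoggerAdapter):
    """Prepend a fixed identifier to every message sent through the adapter."""

    def process(self, msg, kwargs):
        return f"{self.extra['prefix']}: {msg}", kwargs


def get_prefixed_logger(prefix: str) -> logging.LoggerAdapter:
    """Wrap a logger so all of its messages carry the given prefix."""
    return PrefixAdapter(logging.getLogger("pebblo.sketch"), {"prefix": prefix})
```

Calling `get_prefixed_logger("LoaderApp").debug("App Discover Request Processed Successfully")` would emit the message with "LoaderApp: " in front.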

dristysrivastava and others added 2 commits August 22, 2024 15:51
* Made sqlite table creation conditional.

* Test cases fixed

* Ruff fixes.
@shreyas-damle shreyas-damle merged commit 34ff421 into main Aug 22, 2024
15 checks passed