Releases: daxa-ai/pebblo
v0.1.20
Key Highlights
- Support to request anonymization on a per-SafeLoader invocation
- Pebblo TextLoader support
- Updated Entity Classifier
- Added support for Azure Client Secret
- Added support for email-address
- Improved credit-card matching
- SQLite state storage is now a supported and preferred option
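The release notes do not say how credit-card matching was improved. As a rough illustration only, one common way to cut false positives on card-like digit runs is a Luhn checksum. The sketch below assumes that approach; `luhn_valid` is a hypothetical helper, not part of Pebblo.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum.

    Hypothetical helper for illustration; not Pebblo's actual matcher.
    """
    digits = [int(ch) for ch in number if ch.isdigit()]
    # Payment card numbers are typically 13 to 19 digits long.
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


if __name__ == "__main__":
    print(luhn_valid("4111 1111 1111 1111"))
    print(luhn_valid("4111 1111 1111 1112"))
```

A regex match that also passes this check is much more likely to be a real card number than an arbitrary 16-digit string.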
Deprecation Notice
- File-based state storage will be deprecated in the next release, v0.1.21. It is recommended to switch to `type: db` in the pebblo configuration. See https://daxa-ai.github.io/pebblo/config#storage for more details.
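To check an existing setup ahead of the deprecation, something like the following sketch may help. It assumes the configuration has already been parsed into a dictionary and that the storage type lives under a `storage` key, mirroring the `#storage` docs section. That nesting and the `check_storage_type` helper are assumptions, not Pebblo code.

```python
import warnings


def check_storage_type(config: dict) -> str:
    """Return the configured storage type, warning if it is file-based.

    Assumes a parsed config shaped like {"storage": {"type": "db"}};
    this layout is an assumption made for illustration.
    """
    storage_type = config.get("storage", {}).get("type")
    if storage_type is None:
        raise ValueError("storage.type is not set")
    if storage_type == "file":
        warnings.warn(
            "File-based state storage is deprecated; switch to type: db",
            DeprecationWarning,
            stacklevel=2,
        )
    elif storage_type != "db":
        raise ValueError(f"Unsupported storage type: {storage_type!r}")
    return storage_type


if __name__ == "__main__":
    print(check_storage_type({"storage": {"type": "db"}}))
```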
Configuration
For a custom configuration file, refer to https://daxa-ai.github.io/pebblo/config#default-configuration for the supported fields.
What's Changed
- Revert docker_release.yml changes by @srics in #551
- Add Ruff sort imports command to the format_diff target by @Raj725 in #549
- added changes for documentation 0.19 by @gr8nishan in #556
- added multi-arch build for docker. by @rajburnwal07 in #553
- New field added client-secret for Azure client secret ID. by @dristysrivastava in #555
- Samples: text loader sample using PebbloTextLoader by @Raj725 in #539
- Adding anonymize_snippet as a input filed in /loader/doc API by @dristysrivastava in #558
- Adding deprecation message for db storage type as file in config by @dristysrivastava in #559
- Added Pebblo Docs Version - 0.1.19 by @rutujaac in #563
- docs: update README.md by @eltociear in #562
- Added the last released versions of langchain-community to run tests by @Rohan-sss1 in #561
- added sonarscan workflow. by @rajburnwal07 in #566
- Adding anonymizeSnippets in reports config by @dristysrivastava in #569
- Returning default value if no config path is passed by @dristysrivastava in #571
- samples: Added readme file, metadata to Text loader sample by @Raj725 in #565
- Update env var for Sonarscan workflow by @rajburnwal07 in #570
- Add LinkedIn shield button by @srics in #572
- Updated anonymizeSnippets location from classifier to reports in config.yaml by @dristysrivastava in #573
- Updating file deprecation msg by @dristysrivastava in #574
- Update LinkedIn URL by @sridhar-daxa in #576
- Update pebblo version to 0.1.20rc1 by @srics in #577
- DB query optimization and reducing sqlalchemy logs by @shreyas-damle in #575
- Added changes for Email Address and Credit Card Number by @gr8nishan in #580
- Docs: table for Pebblo Safe loader args/parameters description by @Raj725 in #568
- added changes for email address documentation by @gr8nishan in #582
- Update pebblo version to 0.1.20 by @shreyas-damle in #583
New Contributors
- @Rohan-sss1 made their first contribution in #561
Full Changelog: v0.1.19...v0.1.20
v0.1.19
Key Highlights
- Support for new entity types:
  - GitHub Fine-grained Personal Access Token
  - RSA, EC, DSA Private Keys
  - OpenSSH Private Key
  - Google Account Private Key
- Improved configuration validation
- Improvements to SQLite DB state storage subsystem
- Security fixes
What's Changed
- Ensure the build module is installed before running build command by @Raj725 in #511
- Update Pebblo Docs Footer by @rutujaac in #517
- #515 raise if --config is passed but file does not exist by @aramnhammer in #516
- Upversion pebblo to 0.1.19rc1 by @srics in #525
- Bugfix/#514 move weasyprint dependency check to config validation by @aramnhammer in #524
- added changes for github and slack token by @gr8nishan in #512
- Confidence score changes for DB by @dristysrivastava in #523
- SQLite - Minor fixes by @shreyas-damle in #521
- Fix entity details response to return based on location order by @dristysrivastava in #528
- Remove pinned pkg versions to avoid dependency error by @srics in #513
- Fix the ruff command error by @Raj725 in #520
- DB load history by @shreyas-damle in #529
- Add support for private keys by @gr8nishan in #527
- Fixed timeit issue by @shreyas-damle in #531
- Fixed text overflow on snippet details "snippet" string by @rutujaac in #526
- Fix Data source path overflow issue by @rutujaac in #500
- Dependabot fixes by @Raj725 in #534
- Add lint-fix target by @srics in #532
- added changes for new api for entity and topic classification by @gr8nishan in #518
- Fixed the issue where app was being shown multiple times on local UI by @shreyas-damle in #535
- Added new Pebblo Docs version by @rutujaac in #536
- Fix sorting on findings-total by @rutujaac in #537
- Swagger changes in pebblo server APIs by @dristysrivastava in #530
- Adding mode knob in config under classifier by @dristysrivastava in #538
- Showing anonymized data in DB apps by @dristysrivastava in #533
- Adding classifier mode changes for file type storage by @dristysrivastava in #540
- Fixed integration tests by @shreyas-damle in #541
- Show zero state for App Finding tabs by @rutujaac in #544
- Fix Security Issues by @rutujaac in #542
- Show zero in tables on Local UI by @rutujaac in #545
- Update version to 0.1.19 by @srics in #546
- Fixing client_version error by @dristysrivastava in #547
New Contributors
- @aramnhammer made their first contribution in #516
Full Changelog: v0.1.18...v0.1.19
v0.1.18
Key Highlights
- Confidence score in Local UI for document snippets
- IP address as a supported entity
- Improved Entity Classifier accuracy
- SQLite DB as an option for state storage (beta)
- Stability improvements in Pebblo Client in Langchain
- Versioned Pebblo documentation
What's Changed
- Update docker-bake.hcl by @siddheshwar-more in #456
- made changes to docs for prompt governance by @gr8nishan in #457
- added changes to merge classes NDA_AGREEMENT, SETTLEMENT_AGREEMENT, D… by @gr8nishan in #458
- Adding test log config in UTs by @dristysrivastava in #461
- Integration test fix by @shreyas-damle in #462
- Adding client version to report.json file by @dristysrivastava in #465
- Setup: Documentation Versioning by @rutujaac in #464
- Pebblo 0.1.18 to main by @shreyas-damle in #486
- Pebblo Docs - Hide "Version not actively maintained" note by @rutujaac in #487
- Sqlite Changes by @shreyas-damle in #472
- Retrieval App Bug Fix by @shreyas-damle in #495
- Added documentation for LlamaIndex Pebblo Safe DataLoader by @yograjopcito in #469
- Update client version field in PDF reports and local UI app details page by @rutujaac in #463
- DB Retriever app prompt with findings fix by @dristysrivastava in #496
- Loader doc db response by @shreyas-damle in #499
- Fixed version error in case of empty clientVersion by @rutujaac in #501
- Fixed loader/doc issues w.r.t. loadId by @shreyas-damle in #504
- Fixing loader API changes by @dristysrivastava in #502
- added slack notification in case of build failure. by @rajburnwal07 in #490
- Added new version - python 3.12.5 for UTs by @shreyas-damle in #509
- Updated pydantic version and relevant model changes by @shreyas-damle in #505
- Update pebblo version to 0.1.18 by @srics in #510
- Add Langchain tab in Pebblo Docs by @rutujaac in #497
- Fix Snippets Listing Page - Snippets Count issue by @rutujaac in #503
- Documentation for storage type by @shreyas-damle in #508
- Confidence Score fix to highlight only the entity which is present in the label name by @dristysrivastava in #498
- Updated doc for 0.1.18 version by @dristysrivastava in #506
New Contributors
- @rajburnwal07 made their first contribution in #490
Full Changelog: v0.1.17...v0.1.18
v0.1.17
Key Highlights
- Prompt Governance (beta) to find and flag restricted entities like secret-keys in prompt text
- Checksum based RAG context to document source mapping
- Logging enhancements, to both console and file logs with rotation
- Docker config updates
- Documentation updates based on user community feedback
- Security fixes for dependency packages
- Enhanced samples including for SharePoint-Postgres
What's Changed
- Pinned python 3.12 version for UTs by @shreyas-damle in #401
- Excluding RAG apps from ruff check by @dristysrivastava in #405
- docs: Add SharePointLoader and Postgres PGVector supported list by @Raj725 in #404
- Replacing doc with checksum by @dristysrivastava in #402
- Updated samples: Added SharePoint-Postgres, Rearranged existing samples, Updated docs by @Raj725 in #406
- docs: update README.md by @eltociear in #407
- Sharepoint samples: Updated readme, requirements and token check by @Raj725 in #410
- Remove text-capitalise on app names on Pebblo UI by @rutujaac in #408
- Index for sample apps by @Raj725 in #409
- Update README and requirements for samples by @srics in #411
- Fix for blank retrieval app page by @shreyas-damle in #413
- Pinned python-bidi version to 0.4.2 by @shreyas-damle in #419
- Sort Apps on app listing page based on findings by @shreyas-damle in #417
- Add Delete App by @shreyas-damle in #416
- Fix Dependabot findings in Pebblo samples by @Raj725 in #422
- Fix Dependabot findings in Pebblo by @shreyas-damle in #423
- Docker latest tag from pipeline by @siddheshwar-more in #418
- Display pebblo version by @srics in #426
- Pebblo UI - Delete App UI by @rutujaac in #428
- Docker pebblo config to .pebblo by @siddheshwar-more in #431
- Update pyproject.toml by @srics in #435
- Updating pebblo version by @dristysrivastava in #439
- Update tests.yml by @siddheshwar-more in #442
- Enhanced PDF reporting docker pebblo by @siddheshwar-more in #420
- Pebblo UI : Fix http to https auto upgrade issue by @rutujaac in #412
- Delete App for Retrieval Apps by @rutujaac in #438
- Docker docs update by @siddheshwar-more in #440
- Log messages updated by @shreyas-damle in #441
- Sort prompt retrievals based on time by @shreyas-damle in #447
- Sort Loader apps based on total findings by @shreyas-damle in #446
- Adding langchain client version by @dristysrivastava in #414
- Add combined file and console logger by @srics in #433
- Prompt governance by @gr8nishan in #430
- Word break: all if overflows column width by @rutujaac in #382
- Added changes for try catch in al app retrieval by @gr8nishan in #451
- Updating pebblo config and logs by @dristysrivastava in #427
- Update docs for cacheDir and logging changes by @srics in #453
- Update release version to 0.1.17 by @srics in #454
- Update Pebblo client key by @rutujaac in #452
- Revert "Update Pebblo client key" by @srics in #455
New Contributors
- @eltociear made their first contribution in #407
Full Changelog: v0.1.16...v0.1.17
v0.1.16
Key Highlights
- Support for Sharepoint loader with authorized identities and extended metadata
- Updated `pebblo-classifier` to Version 9 with improved topic accuracy
- Support for Postgres PGVector VectorDB for identity and semantic-topic enforcement
- Documentation updates for `PebbloRetrieval` chain with samples for identity and semantic-topic enforcement
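Conceptually, identity and semantic-topic enforcement means filtering retrieved chunks before they reach the LLM. The sketch below shows that idea with plain Python objects. The `Chunk` class, its fields, and `enforce` are hypothetical names for illustration and do not reflect the `PebbloRetrieval` chain's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A retrieved chunk with hypothetical enforcement metadata."""
    text: str
    authorized_identities: set[str]
    topics: set[str] = field(default_factory=set)


def enforce(chunks: list[Chunk], user_identities: set[str],
            denied_topics: set[str]) -> list[Chunk]:
    """Keep chunks the user may see and that carry no denied topic."""
    allowed = []
    for chunk in chunks:
        # Identity enforcement: user must share at least one identity.
        if not chunk.authorized_identities & user_identities:
            continue
        # Semantic enforcement: drop chunks tagged with a denied topic.
        if chunk.topics & denied_topics:
            continue
        allowed.append(chunk)
    return allowed


if __name__ == "__main__":
    chunks = [
        Chunk("Q1 payroll summary", {"hr-team"}, {"financial-report"}),
        Chunk("Office opening hours", {"hr-team", "all-staff"}),
    ]
    for chunk in enforce(chunks, {"all-staff"}, {"financial-report"}):
        print(chunk.text)
```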
What's Changed
- Adding lint fixes by @dristysrivastava in #380
- Updated UT and minor linting fixes by @shreyas-damle in #381
- Updated documentation for Pebblo UI by @rutujaac in #387
- Documentation for PebbloRetrievalQA chain by @Raj725 in #388
- SafeRetriever docs and README updates by @srics in #389
- Updating field validator for prompt API by @dristysrivastava in #379
- Handle bubble chart height for last number of finding data by @rutujaac in #383
- Skip pkg cache to avoid hash mistmatch by @srics in #384
- Fix test pypi pkg naming to include yr.mo.dt.hh.mm by @srics in #385
- Add GoogleDriveLoader to the list of supported loaders by Pebblo by @rahul-trip in #390
- Cleaning up discovery API response by @dristysrivastava in #396
- added changes to integrate v9 version of topic classifier by @gr8nishan in #397
- Updating sensitive data for different classification by @dineshkrr in #394
- Update langchain library versions and imports in the sample apps by @Raj725 in #398
- Add docs for Semantic Enforcement RAG using PebbloRetrievalQA by @Raj725 in #392
- add sharepoint retriever app. by @rahul-trip in #399
- Update release version by @srics in #400
New Contributors
- @dineshkrr made their first contribution in #394
Full Changelog: v0.1.15...v0.1.16
v0.1.15
Key Highlights
- Dedicated Safe Retriever Local UI page for visualization of AI apps using `PebbloRetrievalChain`
- Visualize top users using LLM inference, including their authorization groups
- Visualize top documents retrieved from Vector DB to facilitate data governance and compliance
- Kubernetes deployment of pebblo server docker image
- Updated samples with upstreamed `langchain-google` pkg for authorization-enabled `GoogleDriveLoader`
- Extended metadata support for `GoogleDriveLoader`: document paths, names, and sizes
- Initial Pebblo support for LlamaIndex
- Topic and Entity name enhancements in pebblo-schema
What's Changed
- Log and print statement were not working in topic classfiier by @gr8nishan in #309
- Changes for showing pebblo server and client version on DAXA UI. by @dristysrivastava in #319
- Entity Classifier Fix by @shreyas-damle in #325
- Removed text capitalize on file name in local UI by @rutujaac in #327
- Updating entities and Topics names from human readable format to programming format by @dristysrivastava in #306
- Added keyword to string mapping in reports and ui by @rutujaac in #326
- Example for Llama PebbloSafeReader by @yograjopcito in #313
- Docker repo fix by @siddheshwar-more in #328
- Use GoogleDriveLoader from langchain_google_community by @Raj725 in #337
- Added make command to format all/changed files by @Raj725 in #340
- Update rag.md for pebblo environment variable by @rahul-trip in #348
- Update Identity and Semantic enforcement samples (PebbloRetrievalQA) by @Raj725 in #335
- Docs and version updates by @srics in #357
- Remove spell check in CI lint pipeline by @srics in #367
- Unit test trigger on tests by @siddheshwar-more in #366
- Pebblo k8s deployment by @siddheshwar-more in #362
- Topic classifier fixes - Normal text in findings, Lint errors by @Raj725 in #370
- Pebblo redirection and download report URL fix by @rutujaac in #354
- Pebblo Safe Retriever details in Local UI by @shreyas-damle in #353
- Adding python dateutil package by @dristysrivastava in #374
- Fix Local UI issues by @rutujaac in #375
- Update samples: Import PebbloRetrievalQA from langchain_community by @Raj725 in #364
- Capitalize app names for local UI by @rutujaac in #376
- source_aggr_size renamed to source_aggregate_size in loader/doc API. by @shreyas-damle in #378
- Add sort by date recent for active users and docs by @rutujaac in #377
New Contributors
- @gr8nishan made their first contribution in #309
- @yograjopcito made their first contribution in #313
Full Changelog: v0.1.14...v0.1.15
v0.1.14
Key highlights
- [Local UI / Reporting] Display authorized identities in ingested documents
- [Local UI / Reporting] Display RAG document snippet authorized identities
- Bubble chart of top semantic topics ingested into Vector DB
- Add support for Pebblo Cloud API - to display Gen-AI apps in Daxa UI
- Docker support for pebblo-server installation and deployment
- Add SafeRetriever sample with Identity and Semantic Topic enforcement
What's Changed
- Add Pebblo SafeLoader and SafeRetriever by @srics in #273
- MyPy Config Changes by @siddheshwar-more in #274
- Update SafeLoader and SafeRetriever by @srics in #276
- Fix spelling errors by @rutujaac in #284
- Pebblo Docs - Add different reports examples in Config tab by @rutujaac in #283
- Add timezone to local UI dates by @rutujaac in #287
- Updated Snippet tab in Pebblo UI by @rutujaac in #270
- Add UT for reports module by @rutujaac in #258
- Pebblo identity rag demo by @Raj725 in #290
- Added Pebblo Cloud Sample App by @shreyas-damle in #292
- Added Empty state for Reports by @rutujaac in #285
- Pebblo UI - Handle null states for objects by @rutujaac in #294
- Added readme and requirements file for pebblo cloud sample by @shreyas-damle in #295
- Update download url for pip install by @siddheshwar-more in #298
- Added changes for Docker image by @siddheshwar-more in #277
- Pebblo Docker README and Installation guide changes by @siddheshwar-more in #299
- Loader doc response changes for Pebblo Cloud initial work by @shreyas-damle in #297
- Docker config pebblo by @siddheshwar-more in #301
- Pebblo install using *.whl file download by @siddheshwar-more in #300
- Allow CORS by @dristysrivastava in #302
- Adding pebblo client as well as server version for local UI and reports by @dristysrivastava in #303
- Added pebblo version on reports and local UI by @rutujaac in #304
- Added bubble chart for findings by @rutujaac in #305
- Fixed breaking Local UI by @shreyas-damle in #310
- Fixed bubble chart render by @rutujaac in #311
- Add samples for retrieval with Identity & Semantic Enforcement by @Raj725 in #307
- Authorized Identity - backend changes to show authorized identities by @shreyas-damle in #308
- Identity UI added for local ui and reports by @rutujaac in #312
- Update bubble chart UI by @rutujaac in #317
- Updated Pebblo client version UI by @rutujaac in #316
- Upversion to 0.1.14 by @srics in #321
- Update Tooltip UI and truncate long file names in table in Local UI by @rutujaac in #320
- Docker: Update logging level to Info by @siddheshwar-more in #314
- Update bubble chart height to be dynamic by @rutujaac in #322
- Fix: single finding in bubble chart by @rutujaac in #324
Full Changelog: v0.1.13...v0.1.14
v0.1.13
Key highlights
- Anonymize snippets to redact all PII details in the generated report (both PDF and local UI)
- Improved Topic Classifier model with better F1 score (accuracy, recall) for `MEDICAL_ADVICE`, `HARMFUL_ADVICE` and reduced false positives for `NORMAL_TEXT`
- Added support for more Langchain Document Loaders: Google Drive, Slack, Notion and UnstructuredEmail
- Switched default PDF renderer from `weasyprint` to `xhtml2pdf`
- Removed out-of-the-box dependency for `pango` libraries for PDF generation. Users requiring high-fidelity PDF reports can continue to use `weasyprint` and `pango`. See configuration guide for more details
- Local-UI enhancements
- Configuration file parsing enhancements
- Documentation updates
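As a rough picture of what snippet anonymization does, the sketch below replaces one kind of PII (email addresses) with a placeholder before the snippet is shown in a report. It uses a plain regular expression as a stand-in. The pattern, the placeholder text, and `anonymize_snippet` are assumptions for illustration, not Pebblo's implementation, which covers more entity types.

```python
import re

# Simplified email pattern; real-world detection needs broader coverage.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def anonymize_snippet(text: str, placeholder: str = "<EMAIL_ADDRESS>") -> str:
    """Replace every email address in `text` with `placeholder`."""
    return EMAIL_PATTERN.sub(placeholder, text)


if __name__ == "__main__":
    print(anonymize_snippet("Contact jane.doe@example.com for access."))
```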
What's Changed
- Updated Sidebar for Pebblo Docs by @rutujaac in #217
- Topic Classifier Linting/Formatting fixes by @Raj725 in #227
- Ruff and pylint fixes for pebblo/app module by @shreyas-damle in #236
- Integration tests by @siddheshwar-more in #192
- Fixed Pebblo Integration tests by @siddheshwar-more in #252
- Update Pebblo topic classifier to version V8 by @Raj725 in #250
- Reports module formatting and linting fixes by @rutujaac in #253
- [Update] Pebblo UI documentation added. by @KumarNitin19 in #243
- Updated default config value to xhtml2pdf by @KunalJadhav5 in #215
- [Fix] Redirection for local ui implemented by @KumarNitin19 in #242
- Added validation for config and updated renderer details in config documentation. by @KunalJadhav5 in #237
- Adding document anonymizer by @dristysrivastava in #249
- Integration tests added changes to build from main by @siddheshwar-more in #255
- [Fix] Local UI load history table fix by @KumarNitin19 in #245
- Added timezone to date strings in report pdf by @rutujaac in #259
- Ruff formatting fixes by @shreyas-damle in #256
- Added file path for icons by @rutujaac in #260
- Updated Warning message . by @KunalJadhav5 in #248
- Added empty and error states for local UI by @rutujaac in #261
- Updated code to handle limited value in config.yaml by @KunalJadhav5 in #257
- Added Page not found on Pebblo UI by @rutujaac in #263
- Fix for app getting disappeared from local UI while execution is in progress by @shreyas-damle in #265
- Added pebblo UI screenshot and server details note by @rutujaac in #266
- Release 0.1.13 version update pyproject.toml by @siddheshwar-more in #268
- Updating config value anonymizeAllEntities to anonymizeSnippets by @dristysrivastava in #269
- Remove pango dependency by @shreyas-damle in #271
- Update README - remove reference to pango by @srics in #272
Full Changelog: v0.1.12...v0.1.13
v0.1.12
Key highlights
- New Local-UI to browse the reports at `http://localhost:8000/pebblo`
- Enhanced Entity Classifier with context matching for improved accuracy
- PDF Report generation using xhtml2pdf (removes `pango` dependency)
- Documentation updates
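Context matching generally means raising a match's confidence when related words appear nearby. The sketch below illustrates the idea with a US-SSN-like pattern and a small context word list. The pattern, word list, scores, and window size are all hypothetical values chosen for the example; they are not taken from Pebblo or Presidio.

```python
import re

# Hypothetical pattern and context words, for illustration only.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CONTEXT_WORDS = {"ssn", "social", "security"}


def score_matches(text: str, base_score: float = 0.5,
                  boost: float = 0.35, window: int = 5) -> list[tuple[str, float]]:
    """Return (match, score) pairs, boosting matches preceded by context words."""
    results = []
    for match in SSN_PATTERN.finditer(text):
        # Look only at the last `window` words before the match.
        preceding = re.findall(r"[a-z]+", text[: match.start()].lower())[-window:]
        score = base_score
        if CONTEXT_WORDS.intersection(preceding):
            score = min(1.0, base_score + boost)
        results.append((match.group(), score))
    return results


if __name__ == "__main__":
    print(score_matches("My SSN is 123-45-6789"))
    print(score_matches("Order 123-45-6789 shipped"))
```

The same digits score higher in the first sentence than in the second, which is how context can separate a likely identifier from a lookalike such as an order number.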
What's Changed
- Update Pebblo diagrams by @srics in #182
- Resolve ruff lint errors by @shreyas-damle in #171
- Config Documentation for Pebblo by @KunalJadhav5 in #166
- Update load history template by @rutujaac in #174
- Changes for mypy ruff linting by @siddheshwar-more in #190
- Added details for Load History section in report. by @shreyas-damle in #181
- Adding documentation for Pebblo Topic and Entity Classifier by @rohiniNN in #186
- Update xhtml2pdf report template by @rutujaac in #188
- Rename files to documents in weasyprintTemplate.html by @rutujaac in #193
- Changing Pebblo Daemon to Pebblo server and fixing previous PR bug by @rohiniNN in #197
- fixing report link in document pages by @rohiniNN in #201
- Added config documentation in sidebar. by @KunalJadhav5 in #198
- Pebblo Docs - Remove unused md files by @rutujaac in #194
- Pebblo CI tests by @siddheshwar-more in #202
- [Entity Classifier] Enhance Entity Classifier using words context (Presidio) by @dristysrivastava in #204
- Updating scores for entity classifier for context matching by @dristysrivastava in #210
- Update field in xhtml2pdf template for findings by @rutujaac in #211
- Feature local UI by @shreyas-damle in #222
- Fixed swapped findings entities and findings topics counts. by @shreyas-damle in #238
- [Fix] data source label value fixed by @KumarNitin19 in #239
- [Fix] local ui app details page snippet count fixed by @KumarNitin19 in #240
- [Fix] search fixed for table local ui by @KumarNitin19 in #241
- Upversion to 0.1.12 by @srics in #246
Full Changelog: v0.1.11...v0.1.12
v0.1.11
New features
- Pebblo Safe DataLoader is now upstreamed, built-in to the official Langchain release (versions `>=0.1.7`)
- Load history
- Configuration support
- xhtml2pdf renderer (alternative to weasyprint/pango renderer)
- Improved Topic classifier behavior for small chunk sizes
- Improved terminal logging with progress bars
- Documentation improvements
What's Changed
- Rutuja gh pages docs by @rutujaac in #121
- Rutuja update deploy gh pages by @rutujaac in #122
- Delete .github/workflows/jekyll-gh-pages.yml by @sid-cd-daxa in #123
- Update gh page workflow by @sid-cd-daxa in #131
- TestPy Version update with git commit sha by @siddheshwar-more in #134
- Update build.yml by @siddheshwar-more in #135
- Updated TestPyPi workflow by @siddheshwar-more in #136
- Update README and workflow to use new doc location by @srics in #126
- Fix docusaurus build by @rutujaac in #145
- Config.yaml Support for Pebblo by @KunalJadhav5 in #137
- Log improvements in pebblo-server. by @rahul-trip in #124
- xhtml2pdf report template by @rutujaac in #140
- Update 'Edit Page' URL by @rutujaac in #132
- Updated topic list & Fixed min input length condition by @Raj725 in #138
- Update samples to use upstream langchain by @srics in #152
- Use uvicorn logger and handle output dir location given in config.yaml file by @shreyas-damle in #150
- log further improvements. by @rahul-trip in #153
- Create troubleshooting.md by @rahul-trip in #154
- fix review comments in PR: 154 by @rahul-trip in #157
- Update README for langchain pebblo upstream release by @srics in #158
- Skipped printing config details at the startup by @shreyas-damle in #159
- Reading report details from config.yaml by @KunalJadhav5 in #148
- Updated git workflow for multiple python version by @siddheshwar-more in #155
- Spellcheck on docs by @rahul-trip in #161
- Python version fix for unit tests by @siddheshwar-more in #165
- suppress progress bar in --help option. by @rahul-trip in #164
- Pebblo Issue template by @siddheshwar-more in #133
- Removed unused param by @shreyas-damle in #167
- Update issue templates by @siddheshwar-more in #168
- updated the linting workflow by @siddheshwar-more in #170
- Feature app history by @shreyas-damle in #151
- Update version 0.1.10 -> 0.1.11 by @Raj725 in #172
New Contributors
- @sid-cd-daxa made their first contribution in #123
Full Changelog: v0.1.10...v0.1.11