Merge branch 'main' into fix-node-string-print

run-llama · Sep 23, 2024 · b8e9fe9
2 parents f6b21d3 + f037527
Showing 116 changed files with 3,367 additions and 1,093 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,82 @@
 # ChangeLog
 
+## [2024-09-22]
+
+### `llama-index-core` [0.11.12]
+
+- Correct Pydantic warning(s) issued for llm base class (#16141)
+- globally safe format prompt variables in strings with JSON (#15734)
+- account for tools in prompt helper and response synthesizers (#16157)
+
+### `llama-index-readers-google` [0.4.1]
+
+- feat: add drive link to google drive reader metadata (#16156)
+
+### `llama-index-readers-microsoft-sharepoint` [0.3.2]
+
+- Add required_exts option to SharePoint reader (#16152)
+
+### `llama-index-vector-stores-milvus` [0.2.4]
+
+- Support user-defined schema in MilvusVectorStore (#16151)
+
+## [2024-09-20]
+
+### `llama-index-core` [0.11.11]
+
+- Use response synthesizer in context chat engines (#16017)
+- Async chat memory operation (#16127)
+- Sql query add option for markdown response (#16103)
+- Add support for Path for SimpleDirectoryReader (#16108)
+- Update chat message class for multi-modal (#15969)
+- fix: `handler.stream_events()` doesn't yield StopEvent (#16115)
+- pass `hybrid_top_k` in vector retriever (#16105)
+
+### `llama-index-embeddings-elasticsearch` [0.2.1]
+
+- fix elasticsearch embedding async function (#16083)
+
+### `llama-index-embeddings-jinaai` [0.3.1]
+
+- feat: update JinaEmbedding for v3 release (#15971)
+
+### `llama-index-experimental` [0.3.3]
+
+- Enhance Pandas Query Engine Output Processor (#16052)
+
+### `llama-index-indices-managed-vertexai` [0.1.1]
+
+- fix incorrect parameters in VertexAIIndex client (#16080)
+
+### `llama-index-node-parser-topic` [0.1.0]
+
+- Add TopicNodeParser based on MedGraphRAG paper (#16131)
+
+### `llama-index-multi-modal-llms-ollama` [0.3.2]
+
+- Implement async for multi modal ollama (#16091)
+
+### `llama-index-postprocessor-cohere-rerank` [0.2.1]
+
+- feat: add configurable base_url field in rerank (#16050)
+
+### `llama-index-readers-file` [0.2.2]
+
+- fix bug missing import for bytesio (#16096)
+
+### `llama-index-readers-wordpress` [0.2.2]
+
+- Wordpress: Allow control of whether Pages and/or Posts are retrieved (#16128)
+- Fix Issue 16071: wordpress requires username, password (#16072)
+
+### `llama-index-vector-stores-lancedb` [0.2.1]
+
+- fix hybrid search with latest lancedb client (#16057)
+
+### `llama-index-vector-stores-mongodb` [0.3.0]
+
+- Fix mongodb hybrid search top-k specs (#16105)
+
 ## [2024-09-16]
 
 ### `llama-index-core` [0.11.10]

diff --git a/docs/docs/CHANGELOG.md b/docs/docs/CHANGELOG.md
@@ -1,5 +1,82 @@
 # ChangeLog
 
+## [2024-09-22]
+
+### `llama-index-core` [0.11.12]
+
+- Correct Pydantic warning(s) issued for llm base class (#16141)
+- globally safe format prompt variables in strings with JSON (#15734)
+- account for tools in prompt helper and response synthesizers (#16157)
+
+### `llama-index-readers-google` [0.4.1]
+
+- feat: add drive link to google drive reader metadata (#16156)
+
+### `llama-index-readers-microsoft-sharepoint` [0.3.2]
+
+- Add required_exts option to SharePoint reader (#16152)
+
+### `llama-index-vector-stores-milvus` [0.2.4]
+
+- Support user-defined schema in MilvusVectorStore (#16151)
+
+## [2024-09-20]
+
+### `llama-index-core` [0.11.11]
+
+- Use response synthesizer in context chat engines (#16017)
+- Async chat memory operation (#16127)
+- Sql query add option for markdown response (#16103)
+- Add support for Path for SimpleDirectoryReader (#16108)
+- Update chat message class for multi-modal (#15969)
+- fix: `handler.stream_events()` doesn't yield StopEvent (#16115)
+- pass `hybrid_top_k` in vector retriever (#16105)
+
+### `llama-index-embeddings-elasticsearch` [0.2.1]
+
+- fix elasticsearch embedding async function (#16083)
+
+### `llama-index-embeddings-jinaai` [0.3.1]
+
+- feat: update JinaEmbedding for v3 release (#15971)
+
+### `llama-index-experimental` [0.3.3]
+
+- Enhance Pandas Query Engine Output Processor (#16052)
+
+### `llama-index-indices-managed-vertexai` [0.1.1]
+
+- fix incorrect parameters in VertexAIIndex client (#16080)
+
+### `llama-index-node-parser-topic` [0.1.0]
+
+- Add TopicNodeParser based on MedGraphRAG paper (#16131)
+
+### `llama-index-multi-modal-llms-ollama` [0.3.2]
+
+- Implement async for multi modal ollama (#16091)
+
+### `llama-index-postprocessor-cohere-rerank` [0.2.1]
+
+- feat: add configurable base_url field in rerank (#16050)
+
+### `llama-index-readers-file` [0.2.2]
+
+- fix bug missing import for bytesio (#16096)
+
+### `llama-index-readers-wordpress` [0.2.2]
+
+- Wordpress: Allow control of whether Pages and/or Posts are retrieved (#16128)
+- Fix Issue 16071: wordpress requires username, password (#16072)
+
+### `llama-index-vector-stores-lancedb` [0.2.1]
+
+- fix hybrid search with latest lancedb client (#16057)
+
+### `llama-index-vector-stores-mongodb` [0.3.0]
+
+- Fix mongodb hybrid search top-k specs (#16105)
+
 ## [2024-09-16]
 
 ### `llama-index-core` [0.11.10]

diff --git a/docs/docs/api_reference/node_parser/topic.md b/docs/docs/api_reference/node_parser/topic.md
@@ -0,0 +1,4 @@
+::: llama_index.node_parser.topic
+    options:
+      members:
+        - TopicNodeParser
diff --git a/docs/docs/examples/cookbooks/cleanlab_tlm_rag.ipynb b/docs/docs/examples/cookbooks/cleanlab_tlm_rag.ipynb
@@ -126,7 +126,7 @@
     {
      "data": {
       "text/plain": [
-       "{'trustworthiness_score': 0.9884869430083446}"
+       "{'trustworthiness_score': 0.9884868983475051}"
       ]
      },
      "execution_count": null,
@@ -249,17 +249,16 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "from typing import Dict, List\n",
-    "\n",
+    "from typing import Dict, List, ClassVar\n",
     "from llama_index.core.instrumentation.events import BaseEvent\n",
     "from llama_index.core.instrumentation.event_handlers import BaseEventHandler\n",
     "from llama_index.core.instrumentation import get_dispatcher\n",
     "from llama_index.core.instrumentation.events.llm import LLMCompletionEndEvent\n",
     "\n",
     "\n",
     "class GetTrustworthinessScore(BaseEventHandler):\n",
-    "    events: List[BaseEvent] = []\n",
-    "    trustworthiness_score = 0\n",
+    "    events: ClassVar[List[BaseEvent]] = []\n",
+    "    trustworthiness_score: float = 0.0\n",
     "\n",
     "    @classmethod\n",
     "    def class_name(cls) -> str:\n",
@@ -356,7 +355,7 @@
      "output_type": "stream",
      "text": [
       "Response: The GAAP earnings per diluted share for the quarter (Q1 FY24) was $0.82.\n",
-      "Trustworthiness score: 1.0\n"
+      "Trustworthiness score: 0.99\n"
      ]
     }
    ],
@@ -448,7 +447,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The lower TLM trustworthiness score indicate a bit more uncertainty about the response, which aligns with the lack of information available. Let's try some more questions."
+    "The lower TLM trustworthiness score indicates a bit more uncertainty about the response, which aligns with the lack of information available. Let's try some more questions."
    ]
   },
   {
@@ -460,7 +459,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Response: The report indicates that NVIDIA's Gaming revenue decreased year over year by 38%, which is attributed to a combination of factors, including a challenging market environment and possibly reduced demand for gaming hardware. While specific reasons for the decline are not detailed in the provided information, the overall decrease in revenue suggests that the gaming sector is facing headwinds compared to the previous year.\n",
+      "Response: The report does not explicitly explain the reasons for the year-over-year decrease in NVIDIA's Gaming revenue. However, it does provide context regarding the overall performance of the gaming segment, noting that first-quarter revenue was $2.24 billion, which is down 38% from a year ago but up 22% from the previous quarter. This suggests that while there may have been a decline compared to the same period last year, there was a recovery compared to the previous quarter. Factors that could contribute to the year-over-year decline might include market conditions, competition, or changes in consumer demand, but these specifics are not detailed in the report.\n",
       "Trustworthiness score: 0.92\n"
      ]
     }
@@ -481,8 +480,8 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Response: The provided context information does not include any details about NVIDIA's dividend payout or the industry average for dividends. Therefore, I cannot provide a comparison of NVIDIA's dividend payout for this quarter to the industry average. Additional information regarding dividends would be needed to answer the query.\n",
-      "Trustworthiness score: 0.87\n"
+      "Response: The context information provided does not include specific details about the industry average for dividend payouts. Therefore, I cannot directly compare NVIDIA's dividend payout for this quarter to the industry average. However, NVIDIA announced a quarterly cash dividend of $0.04 per share for shareholders of record on June 8, 2023. To assess how this compares to the industry average, one would need to look up the average dividend payout for similar companies in the technology or semiconductor industry.\n",
+      "Trustworthiness score: 0.93\n"
      ]
     }
    ],
@@ -566,6 +565,94 @@
     "display_response(response)"
    ]
   },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Response: In NVIDIA's Q1 FY2024 financial results, the following RTX GPU models were officially announced:\n",
+      "\n",
+      "1. **GeForce RTX 4060 family of GPUs**\n",
+      "2. **GeForce RTX 4070 GPU**\n",
+      "3. **Six new NVIDIA RTX GPUs for mobile and desktop workstations**\n",
+      "\n",
+      "This totals to **eight RTX GPU models** announced.\n",
+      "Trustworthiness score: 0.74\n"
+     ]
+    }
+   ],
+   "source": [
+    "response = query_engine.query(\n",
+    "    \"How many RTX GPU models, including all custom versions released by third-party manufacturers and all revisions across different series, were officially announced in NVIDIA's Q1 FY2024 financial results?\",\n",
+    ")\n",
+    "display_response(response)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Response: To calculate the projected annual revenue for NVIDIA's Data Center segment if it maintains its Q1 FY2024 quarter-over-quarter growth rate, we first need to determine the growth rate from Q4 FY2023 to Q1 FY2024.\n",
+      "\n",
+      "NVIDIA reported a record Data Center revenue of $4.28 billion for Q1 FY2024. The revenue for the previous quarter (Q4 FY2023) can be calculated as follows:\n",
+      "\n",
+      "Let \\( R \\) be the revenue for Q4 FY2023. The growth rate from Q4 FY2023 to Q1 FY2024 is given by:\n",
+      "\n",
+      "\\[\n",
+      "\\text{Growth Rate} = \\frac{\\text{Q1 Revenue} - \\text{Q4 Revenue}}{\\text{Q4 Revenue}} = \\frac{4.28 - R}{R}\n",
+      "\\]\n",
+      "\n",
+      "We know that the overall revenue for Q1 FY2024 is $7.19 billion, which is up 19% from the previous quarter. Therefore, we can express the revenue for Q4 FY2023 as:\n",
+      "\n",
+      "\\[\n",
+      "\\text{Q1 FY2024 Revenue} = \\text{Q4 FY2023 Revenue} \\times 1.19\n",
+      "\\]\n",
+      "\n",
+      "Substituting the known value:\n",
+      "\n",
+      "\\[\n",
+      "7.19 = R \\times 1.19\n",
+      "\\]\n",
+      "\n",
+      "Solving for \\( R \\):\n",
+      "\n",
+      "\\[\n",
+      "R = \\frac{7.19}{1.19} \\approx 6.03 \\text{ billion}\n",
+      "\\]\n",
+      "\n",
+      "Now, we can calculate the Data Center revenue for Q4 FY2023. Since we don't have the exact figure for the Data Center revenue in Q4 FY2023, we will assume that the Data Center revenue also grew by the same percentage as the overall revenue. \n",
+      "\n",
+      "Now, we can calculate the quarter-over-quarter growth rate for the Data Center segment:\n",
+      "\n",
+      "\\[\n",
+      "\\text{Growth Rate} = \\frac{4.28 - R_D}{R_D}\n",
+      "\\]\n",
+      "\n",
+      "Where \\( R_D \\) is the Data Center revenue for Q4 FY2023. However, we need to find \\( R_D \\) first. \n",
+      "\n",
+      "Assuming the Data Center revenue was a certain percentage of the total revenue in Q4 FY2023, we can estimate it. For simplicity, let's assume the Data Center revenue was around 50% of the total revenue in Q4 FY2023 (this is a rough estimate, as we don't have the exact figure).\n",
+      "\n",
+      "Thus, \\( R_D \\approx 0.5 \\times 6\n",
+      "Trustworthiness score: 0.69\n"
+     ]
+    }
+   ],
+   "source": [
+    "response = query_engine.query(\n",
+    "    \"If NVIDIA's Data Center segment maintains its Q1 FY2024 quarter-over-quarter growth rate for the next four quarters, what would be its projected annual revenue?\",\n",
+    ")\n",
+    "display_response(response)"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -574,7 +661,11 @@
     "\n",
     "> NVIDIA's revenue increased by $1.14 billion this quarter compared to last quarter.\n",
     "\n",
-    "> Google, Amazon Web Services, Microsoft, Oracle, ServiceNow, Medtronic, Dell Technologies\n",
+    "> Google, Amazon Web Services, Microsoft, Oracle, ServiceNow, Medtronic, Dell Technologies.\n",
+    "\n",
+    "> There is not a specific total count of RTX GPUs mentioned.\n",
+    "\n",
+    "> Projected annual revenue if this growth rate is maintained for the next four quarters: approximately $26.34 billion.\n",
     "\n",
     "With TLM, you can easily increase trust in any RAG system! \n",
     "\n",

diff --git a/docs/docs/examples/embeddings/voyageai.ipynb b/docs/docs/examples/embeddings/voyageai.ipynb
@@ -59,7 +59,7 @@
    "source": [
     "# get API key and create embeddings\n",
     "\n",
-    "model_name = \"voyage-law-2\"  # Please check https://docs.voyageai.com/docs/embeddings for the available models\n",
+    "model_name = \"voyage-3\"  # Please check https://docs.voyageai.com/docs/embeddings for the available models\n",
     "voyage_api_key = os.environ.get(\"VOYAGE_API_KEY\", \"your-api-key\")\n",
     "\n",
     "embed_model = VoyageEmbedding(\n",