
Update fix/analytics data export cleanup #3816

Conversation

@NicholasTurner23 (Contributor) commented on Nov 7, 2024

Description

  1. This PR adds a flag that can reduce the amount of data being exported.
  2. Adds docstrings to the data export endpoint.
  3. Minor optimizations aimed at improving code readability.

Related Issues

  • JIRA cards:
    • OPS-309

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced new functions for filtering non-private sites and devices, enhancing data privacy.
    • Added a new HTTP request handler class that improves error handling and logging.
  • Bug Fixes

    • Refined error handling in various resource classes to provide clearer feedback and prevent potential issues.
  • Refactor

    • Streamlined response handling across multiple resource classes to ensure consistency and clarity.
    • Updated validation logic for input parameters in data export resources.
  • Documentation

    • Improved clarity in error messages for better user understanding.

coderabbitai bot (Contributor) commented on Nov 7, 2024

📝 Walkthrough

The pull request introduces several modifications across various files in the analytics API. Key changes include the refactoring of query construction in the EventsModel to avoid duplicate columns, enhancements to data filtering functions in data_formatters.py, and the introduction of a new AirQoRequests class for improved HTTP request handling. Additionally, response handling in multiple resource classes has been standardized to utilize this new class, ensuring consistency across the API. Overall, these changes refine existing functionalities and improve code clarity and maintainability.
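
For orientation, the snippet below is a hedged sketch of the usage pattern this centralization enables, pieced together from the call sites quoted later in this review; the endpoint path and function name are illustrative assumptions, not code from the PR.

# Illustrative sketch only; the real AirQoRequests lives in src/analytics/api/utils/http.py.
from api.utils.http import AirQoRequests

def fetch_non_private_sites(sites):
    """Example caller mirroring the request/response pattern discussed in this review."""
    airqo_requests = AirQoRequests()
    response = airqo_requests.request(
        endpoint="devices/sites/non-private",  # hypothetical endpoint path
        body={"sites": sites},
        method="post",
    )
    if response and response.get("status") == "success":
        return response.get("data")
    raise RuntimeError(response.get("message") if response else "Request failed")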

Changes

  • src/analytics/api/models/events.py: Modified EventsModel class; commented out the device ID addition in data_export_query to prevent duplicate columns.
  • src/analytics/api/utils/data_formatters.py: Enhanced filtering functions; replaced filter_non_private_entities with filter_non_private_sites and filter_non_private_devices. Improved type annotations and error handling with logging.
  • src/analytics/api/utils/http.py: Introduced the AirQoRequests class for HTTP requests with retry logic, with a nested Status class for status code handling. Added a request method for handling various HTTP methods and improved error handling.
  • src/analytics/api/utils/request_validators.py: Updated imports to use AirQoRequests for status code references; modified status code handling in validate_params.
  • src/analytics/api/views/dashboard.py: Replaced filtering functions with new, more specific functions; standardized response handling using AirQoRequests.create_response.
  • src/analytics/api/views/data.py: Enhanced DataExportResource and DataExportV2Resource classes with additional parameter validation and updated response handling using AirQoRequests.
  • src/analytics/api/views/report.py: Standardized response handling across multiple resource classes using AirQoRequests; updated error messages for clarity.

Suggested labels

ready for review

Suggested reviewers

  • Baalmart
  • BenjaminSsempala
  • AnthonyByansi

Poem

In the code where queries flow,
A tweak to keep the dataset aglow.
Functions refined, clarity found,
With AirQoRequests, responses abound.
Errors logged, no more a mess,
Here’s to the changes, we all progress! 🎉



coderabbitai bot (Contributor) left a comment

Actionable comments posted: 7

🧹 Outside diff range and nitpick comments (26)
src/analytics/api/views/report.py (4)

41-44: LGTM with a minor suggestion

The response handling is well-structured, but consider making the error message more specific:

-                "A default template already exist",
+                "A default template already exists for this tenant",

112-113: Consider enhancing error messages for better clarity

While the response handling is correct, the error messages could be more descriptive:

-                "The update fields are empty. valid keys are {valid_keys}"
+                f"No valid update fields provided. Allowed fields are: {', '.join(valid_keys)}"
-                "could not update default template"
+                "Default template not found or no changes were made"

Also applies to: 123-126, 130-133


226-230: Enhance response messages for better context

While the response handling is correct, consider making the messages more specific:

-                    f"the update fields is empty. valid keys are {valid_keys}"
+                    f"No valid fields provided for update. Allowed fields are: {', '.join(valid_keys)}"

-                "report updated successfully"
+                f"Monthly report '{report_name}' updated successfully"

-                "report not found"
+                f"Monthly report '{report_name}' not found"

Also applies to: 241-242, 246-247, 260-263, 267-268


Line range hint 1-284: Excellent API structure with room for enhancement

The codebase demonstrates good REST API practices with consistent response handling and thorough documentation. Consider these enhancements:

  1. Add request/response logging for better observability
  2. Consider implementing request ID tracking for better debugging (see the sketch after this list)
  3. Add rate limiting for the API endpoints
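
As a rough illustration of item 2, here is a minimal hedged Flask sketch of request ID tracking; the app wiring, header name, and handler names are assumptions and not part of this PR.

# Minimal sketch of request ID tracking for a Flask service (assumed setup, not this repo's code).
import logging
import uuid

from flask import Flask, g, request

app = Flask(__name__)  # in the real service this would be the existing app factory
logger = logging.getLogger(__name__)

@app.before_request
def assign_request_id():
    # Reuse an upstream ID if a proxy already set one, otherwise generate a new one.
    g.request_id = request.headers.get("X-Request-ID", str(uuid.uuid4()))

@app.after_request
def attach_request_id(response):
    response.headers["X-Request-ID"] = g.get("request_id", "-")
    logger.info(
        "request_id=%s %s %s -> %s",
        g.get("request_id", "-"), request.method, request.path, response.status_code,
    )
    return response
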
src/analytics/api/utils/data_formatters.py (2)

2-3: Clean up imports and enhance logging setup.

Good addition of type hints and logging setup! However, there's an unused import that should be removed.

-from config import Config

Also applies to: 17-18, 20-21


297-324: Consider enhancing error handling specificity.

The implementation looks good with proper documentation and logging. However, consider catching specific exceptions from the HTTP request and providing more detailed error handling.

 try:
     airqo_requests = AirQoRequests()
     response = airqo_requests.request(
         endpoint=endpoint, body={"sites": sites}, method="post"
     )
     if response and response.get("status", None) == "success":
         return response.get("data")
     else:
         raise RuntimeError(response.get("message"))
-except RuntimeError as rex:
+except (requests.RequestException, RuntimeError) as ex:
-    logger.exception(f"Error while filtering non private entities {rex}")
+    logger.exception("Error while filtering non private sites: %s", str(ex))
+    return {}
src/analytics/api/views/dashboard.py (2)

211-212: Consider caching for frequently accessed daily averages.

While the implementation is correct, consider adding caching for the daily averages data, especially if this endpoint experiences high traffic.

Also applies to: 248-256
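
As a rough illustration of the caching suggestion above, here is a minimal hedged sketch using Flask-Caching (the cache.memoize() usage elsewhere in this review suggests the extension is already in place); the backend, timeout, and function shape are assumptions.

# Illustrative only: in the real API the shared cache instance would be reused instead of created here.
from flask import Flask
from flask_caching import Cache

app = Flask(__name__)
cache = Cache(app, config={"CACHE_TYPE": "SimpleCache"})

@cache.memoize(timeout=300)  # cache results for 5 minutes, keyed by the arguments
def get_daily_averages(tenant: str, start_date: str, end_date: str) -> dict:
    # Placeholder for the expensive query/averaging work done by the real resource.
    return {"tenant": tenant, "start": start_date, "end": end_date, "averages": []}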


275-276: Consider consolidating duplicate averaging logic.

There appears to be significant overlap between DailyAveragesResource and DailyAveragesResource2. Consider creating a base class to share common functionality.

class BaseDailyAveragesResource(Resource):
    def _process_averages(self, data):
        values = []
        labels = []
        background_colors = []
        # Common processing logic here
        return values, labels, background_colors

Also applies to: 298-306

src/analytics/api/models/events.py (4)

195-195: Improve the inline comment for better clarity.

The comment correctly identifies the duplicate column issue, but could be more descriptive.

Consider updating the comment to be more specific:

-# f" {devices_table}.device_id AS device_name , " #this column  creates a duplicate column
+# f" {devices_table}.device_id AS device_name , "  # Removed to prevent duplicate device_name column as it's already selected in the outer query

Line range hint 32-32: Consider adding cache TTL for time-sensitive data.

The @cache.memoize() decorator is used effectively for expensive BigQuery operations, but there's no visible cache invalidation strategy. For time-sensitive data, especially in methods like get_downloadable_events, consider adding a Time-To-Live (TTL) to prevent serving stale data.

Consider adding TTL parameter to the cache decorator for methods handling time-series data:

@cache.memoize(timeout=3600)  # 1 hour TTL

Also applies to: 583-583, 669-669, 778-778, 825-825, 891-891


Line range hint 32-32: Implement configurable export limit as mentioned in PR objectives.

The PR objectives mention adding a flag to control export data volume, but the implementation uses a fixed DATA_EXPORT_LIMIT. Consider making this limit configurable via a parameter.

 @classmethod
 @cache.memoize()
-def download_from_bigquery(cls, devices, sites, airqlouds, start_date, end_date, frequency, pollutants, weather_fields):
+def download_from_bigquery(cls, devices, sites, airqlouds, start_date, end_date, frequency, pollutants, weather_fields, limit=None):
     decimal_places = cls.DATA_EXPORT_DECIMAL_PLACES
+    export_limit = limit if limit is not None else cls.DATA_EXPORT_LIMIT
     # ... rest of the method ...
     dataframe = (
         bigquery.Client()
-        .query(f"select distinct * from ({query}) limit {cls.DATA_EXPORT_LIMIT}", job_config)
+        .query(f"select distinct * from ({query}) limit {export_limit}", job_config)
         .result()
         .to_dataframe()
     )

Also applies to: 276-276


Line range hint 271-278: Enhance error handling for BigQuery operations.

The BigQuery client operations lack explicit error handling, which could lead to unclear error messages or unhandled exceptions in case of quota limits, permission issues, or network problems.

Consider wrapping BigQuery operations with proper error handling:

+from google.api_core import retry
+from google.cloud import exceptions as google_exceptions

-    dataframe = (
-        bigquery.Client()
-        .query(f"select distinct * from ({query}) limit {cls.DATA_EXPORT_LIMIT}", job_config)
-        .result()
-        .to_dataframe()
-    )
+    try:
+        dataframe = (
+            bigquery.Client()
+            .query(
+                f"select distinct * from ({query}) limit {cls.DATA_EXPORT_LIMIT}",
+                job_config,
+                retry=retry.Retry(deadline=30),
+            )
+            .result()
+            .to_dataframe()
+        )
+    except google_exceptions.GoogleAPIError as e:
+        raise Exception(f"BigQuery operation failed: {str(e)}") from e
src/analytics/api/utils/http.py (4)

14-15: Enhance the class docstring for AirQoRequests

Consider providing more detailed documentation in the class docstring to explain how to use AirQoRequests and its methods.


17-99: Reconsider nesting the Status class within AirQoRequests

If the Status codes are needed elsewhere in the codebase, nesting Status within AirQoRequests may limit their accessibility. Consider placing Status at the module level for better reusability.
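
For illustration, a minimal hedged sketch of module-level status constants; the constant list is an assumption, and the alias shown is only one way to keep existing AirQoRequests.Status callers working.

# Sketch of src/analytics/api/utils/http.py with Status at module level (illustrative, not the PR's code).
class Status:
    """HTTP status codes shared across the analytics API."""

    HTTP_200_OK = 200
    HTTP_204_NO_CONTENT = 204
    HTTP_400_BAD_REQUEST = 400
    HTTP_404_NOT_FOUND = 404
    HTTP_500_INTERNAL_SERVER_ERROR = 500


class AirQoRequests:
    # Alias preserved so existing AirQoRequests.Status.* references keep working.
    Status = Status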


161-161: Correct the typo in the success message

There's a typo in the success message: "Success fully returned request data". It should be "Successfully returned request data".

Suggested change:

return self.create_response(
-    "Success fully returned request data",
+    "Successfully returned request data",
    data=simplejson.loads(response.data),
    success=True,
)

185-190: Include HTTP status code in the response

Adding an HTTP status code to the response dictionary can help callers handle responses more effectively.

Suggested change:

def create_response(message, data=None, success=True):
    response = {
        "status": "success" if success else "error",
        "message": message,
        "data": data,
+       "status_code": self.Status.HTTP_200_OK if success else self.Status.HTTP_400_BAD_REQUEST
    }
    return response
src/analytics/api/views/data.py (10)

115-115: Remove unnecessary f prefix in string

The string on line 115 does not contain any placeholders. The f prefix is unnecessary and can be removed.

Apply this diff to remove the extraneous f prefix:

-                        f"Specify either a list of airqlouds, sites or devices in the request body",
+                        "Specify either a list of airqlouds, sites or devices in the request body",
🧰 Tools
🪛 Ruff

115-115: f-string without any placeholders

Remove extraneous f prefix

(F541)


124-124: Remove unnecessary f prefix in string

The string on line 124 does not contain any placeholders. The f prefix is unnecessary and can be removed.

Apply this diff to remove the extraneous f prefix:

-                        f"You cannot specify airqlouds, sites and devices in one go",
+                        "You cannot specify airqlouds, sites and devices in one go",
🧰 Tools
🪛 Ruff

124-124: f-string without any placeholders

Remove extraneous f prefix

(F541)


193-193: Remove unnecessary f prefix in string

The string on line 193 does not contain any placeholders. The f prefix is unnecessary and can be removed.

Apply this diff to remove the extraneous f prefix:

-                        f"An Error occurred while processing your request. Please contact support",
+                        "An Error occurred while processing your request. Please contact support",
🧰 Tools
🪛 Ruff

193-193: f-string without any placeholders

Remove extraneous f prefix

(F541)


243-247: Ensure consistent default values for sites and devices

In the DataExportV2Resource class, the default values for sites and devices are set to None, whereas in the DataExportResource class, they are set to empty lists. For consistency and to prevent potential issues when accessing these variables, consider setting the default values to empty lists in both cases.

Apply this diff to update the default values:

             sites = filter_non_private_sites(sites=json_data.get("sites", {})).get(
-                "sites", None
+                "sites", []
             )

             devices = filter_non_private_devices(devices=json_data.get("devices", {})).get(
-                "devices", None
+                "devices", []
             )

272-272: Remove unnecessary f prefix in string

The string on line 272 does not contain any placeholders. The f prefix is unnecessary and can be removed.

Apply this diff to remove the extraneous f prefix:

-                        f"Specify either a list of airqlouds, sites or devices in the request body",
+                        "Specify either a list of airqlouds, sites or devices in the request body",
🧰 Tools
🪛 Ruff

272-272: f-string without any placeholders

Remove extraneous f prefix

(F541)


351-351: Remove unnecessary f prefix in string

The string on line 351 does not contain any placeholders. The f prefix is unnecessary and can be removed.

Apply this diff to remove the extraneous f prefix:

-                        f"An Error occurred while processing your request. Please contact support",
+                        "An Error occurred while processing your request. Please contact support",
🧰 Tools
🪛 Ruff

351-351: f-string without any placeholders

Remove extraneous f prefix

(F541)


381-381: Remove unnecessary f prefix in string

The string on line 381 does not contain any placeholders. The f prefix is unnecessary and can be removed.

Apply this diff to remove the extraneous f prefix:

-                        f"An Error occurred while processing your request. Please contact support",
+                        "An Error occurred while processing your request. Please contact support",
🧰 Tools
🪛 Ruff

381-381: f-string without any placeholders

Remove extraneous f prefix

(F541)


408-408: Remove unnecessary f prefix in string

The string on line 408 does not contain any placeholders. The f prefix is unnecessary and can be removed.

Apply this diff to remove the extraneous f prefix:

-                        f"An Error occurred while processing your request. Please contact support",
+                        "An Error occurred while processing your request. Please contact support",
🧰 Tools
🪛 Ruff

408-408: f-string without any placeholders

Remove extraneous f prefix

(F541)


69-76: Enhance method documentation

The docstring for the post method provides valuable information, but it could include details about the new parameters datatype and minimum to improve clarity for future developers.


130-136: Improve error message specificity for invalid pollutants

The error message on line 133 currently states "Invalid pollutant." Consider specifying which pollutant is invalid to provide clearer feedback to the user.

Apply this diff to include the invalid pollutant in the message:

-                        f"Invalid pollutant. Valid values are {', '.join(valid_options['pollutants'])}",
+                        f"Invalid pollutant {p}. Valid values are {', '.join(valid_options['pollutants'])}",
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 9470a9e and b4c0474.

📒 Files selected for processing (7)
  • src/analytics/api/models/events.py (1 hunks)
  • src/analytics/api/utils/data_formatters.py (4 hunks)
  • src/analytics/api/utils/http.py (1 hunks)
  • src/analytics/api/utils/request_validators.py (2 hunks)
  • src/analytics/api/views/dashboard.py (15 hunks)
  • src/analytics/api/views/data.py (12 hunks)
  • src/analytics/api/views/report.py (11 hunks)
🧰 Additional context used
🪛 Ruff
src/analytics/api/utils/data_formatters.py

18-18: config.Config imported but unused

Remove unused import: config.Config

(F401)

src/analytics/api/views/data.py

107-107: Local variable data_type is assigned to but never used

Remove assignment to unused variable data_type

(F841)


115-115: f-string without any placeholders

Remove extraneous f prefix

(F541)


124-124: f-string without any placeholders

Remove extraneous f prefix

(F541)


193-193: f-string without any placeholders

Remove extraneous f prefix

(F541)


272-272: f-string without any placeholders

Remove extraneous f prefix

(F541)


351-351: f-string without any placeholders

Remove extraneous f prefix

(F541)


381-381: f-string without any placeholders

Remove extraneous f prefix

(F541)


408-408: f-string without any placeholders

Remove extraneous f prefix

(F541)

🔇 Additional comments (12)
src/analytics/api/utils/request_validators.py (1)

10-10: Clean import refactoring!

The switch to AirQoRequests aligns well with the codebase's move towards a more encapsulated HTTP request handling approach.

src/analytics/api/views/report.py (3)

12-12: LGTM: Clean import addition

The addition of AirQoRequests import is well-placed and consistently used throughout the file for standardized HTTP response handling.


89-92: LGTM: Well-structured response

The response format properly separates the message from the data payload, making it clear and consistent with REST API best practices.


163-164: LGTM: Consistent response handling

The response handling follows a consistent pattern across all endpoints, with appropriate status codes and clear messages.

Also applies to: 196-199, 203-204

src/analytics/api/utils/data_formatters.py (1)

201-202: LGTM! Improved type hints enhance code clarity.

The updated type hints using List and Union provide better clarity about the expected input and output types.

src/analytics/api/views/dashboard.py (4)

14-17: LGTM! Well-organized imports for data privacy functions.

The addition of specific filtering functions for sites and devices indicates a good separation of concerns and enhanced data privacy controls.


20-20: LGTM! Good centralization of HTTP response handling.

The introduction of AirQoRequests centralizes response creation, promoting consistency across the API.


152-153: LGTM! Consistent implementation with ChartDataResource.

The changes maintain a consistent pattern of privacy filtering and response handling across chart endpoints.

Also applies to: 171-174


46-47: Verify consistent filtering implementation across chart endpoints.

The implementation looks good, but let's verify that all chart-related endpoints consistently apply the privacy filtering.

Also applies to: 129-133

✅ Verification successful

Let me gather more context about the implementation details of these endpoints to ensure they handle filtering consistently.


Let me try one more search with modified patterns to better understand the implementation details.


Privacy filtering is consistently implemented across all chart endpoints

Based on the comprehensive analysis, all chart-related endpoints in both dashboard.py and data.py consistently implement the filter_non_private_sites function for site data filtering. The implementation is uniform across:

  • ChartDataResource
  • D3ChartDataResource
  • All other endpoints handling site data
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for chart-related endpoints and verify consistent filtering
rg -A 5 "class.*ChartDataResource" src/analytics/api/views/
# Search for usage of filter_non_private_sites in chart endpoints
rg "filter_non_private_sites.*sites" src/analytics/api/views/

Length of output: 1731


Script:

#!/bin/bash
# Check the implementation of chart-related endpoints
ast-grep --pattern 'class ChartDataResource {
  $$$
  post($$$) {
    $$$
  }
  $$$
}'

ast-grep --pattern 'class D3ChartDataResource {
  $$$
  post($$$) {
    $$$
  }
  $$$
}'

# Check if there are any other chart-related resources
rg -B 2 -A 5 "class.*Resource.*Chart" src/analytics/api/views/

# Look for any other endpoints that might handle site data
rg -A 3 '"sites".*required.*list' src/analytics/api/views/

Length of output: 305


Script:

#!/bin/bash
# Search for chart endpoint implementations with more context
rg -B 2 -A 10 "def post" src/analytics/api/views/dashboard.py

# Look for any site filtering patterns in the codebase
rg -B 2 -A 5 "filter_non_private_sites" src/analytics/api/views/

# Check for any other site-related filtering functions
rg -A 3 "def.*filter.*site" src/analytics/api/

Length of output: 7586

src/analytics/api/utils/http.py (2)

129-132: Review the retry strategy configuration

The current retry strategy uses a backoff_factor of 5, which can lead to long delays between retries (urllib3 backs off exponentially, so the waits roughly double each attempt, on the order of 5s, 10s, 20s, 40s). Verify whether this is intentional, or consider lowering the backoff_factor and the total number of retries for better responsiveness.
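
A hedged sketch of a gentler configuration using urllib3's Retry; the values are illustrative, not prescribed by this PR.

# Illustrative values only; tune total/backoff_factor to the caller's latency budget.
import urllib3
from urllib3.util.retry import Retry

retries = Retry(
    total=3,                                     # at most 3 retries
    backoff_factor=0.5,                          # roughly 0.5s, 1s, 2s between attempts
    status_forcelist=[429, 500, 502, 503, 504],  # retry only transient server errors
    allowed_methods=["GET", "POST"],
)
http = urllib3.PoolManager(retries=retries)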


146-146: Verify handling of NaN values in the request body

Using simplejson.dumps with ignore_nan=True replaces NaN values with null. Ensure that the API can handle null values appropriately, or sanitize the data before serialization.
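
If sanitizing before serialization is preferred, here is a small standard-library sketch (illustrative only, not the project's code).

# Sketch: drop NaN values before serializing instead of relying on ignore_nan=True.
import math

def drop_nan_fields(payload: dict) -> dict:
    """Return a copy of the payload without float NaN values."""
    return {
        key: value
        for key, value in payload.items()
        if not (isinstance(value, float) and math.isnan(value))
    }

body = drop_nan_fields({"pm2_5": float("nan"), "site": "site-a"})
# body == {"site": "site-a"}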

src/analytics/api/views/data.py (1)

155-157: Verify the use of HTTP 404 status code when no data is found

When no data is found, the API returns a 404 Not Found status code. Typically, a 200 OK status with an empty data set or a 204 No Content status might be more appropriate. Returning 404 could imply that the endpoint itself is invalid.

If you'd like to verify standard practices, consider reviewing REST API guidelines on status codes for no content scenarios.

@@ -213,7 +213,7 @@ def decorated(*args, **kwargs):
                "status": "error",
                "message": "an error occurred while processing this request",
                "errors": errors,
-            }, Status.HTTP_400_BAD_REQUEST
+            }, AirQoRequests.Status.HTTP_400_BAD_REQUEST
coderabbitai bot (Contributor) commented on Nov 7, 2024

💡 Codebase verification

Inconsistent status code reference patterns detected across modules

The verification reveals that while the analytics module has adopted AirQoRequests.Status, other modules (insights and device-monitoring) are still using direct Status references. Here's where updates are needed:

  • src/insights/namespaces/: Multiple files (report.py, data.py, dashboard.py) using direct Status references
  • src/device-monitoring/helpers/request_validators.py: Using direct Status reference
🔗 Analysis chain

Consistent status code reference update!

The change to AirQoRequests.Status.HTTP_400_BAD_REQUEST maintains the same semantics while aligning with the new standardized response handling approach.

Let's verify that this status code reference pattern is consistently applied across the codebase:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining direct Status imports or references
# This helps ensure complete migration to AirQoRequests

echo "Checking for any remaining direct Status imports..."
rg "from api\.utils\.http import Status"

echo "Checking for any remaining direct Status references..."
rg "Status\.HTTP_[0-9]+"

Length of output: 6382

@NicholasTurner23 (Contributor, Author) replied:

They seem to have the status code duplicated. This PR will not interfere with them. They will be updated at a later date.

coderabbitai bot (Contributor) replied:

@NicholasTurner23, understood. Would you like me to open a new GitHub issue to track updating the status code references in the other modules at a later date?

@NicholasTurner23 (Contributor, Author) replied:

Sure thing.

coderabbitai bot (Contributor) replied:

@NicholasTurner23, I've created a new GitHub issue to track updating the status code references in the 'insights' and 'device-monitoring' modules: Issue #3817.

🔗 Analysis chain
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Create a new GitHub issue to track updating status code references in other modules
gh issue create \
  --title "Update status code references in 'insights' and 'device-monitoring' modules" \
  --body "As discussed in [PR #3816](https://github.com/airqo-platform/AirQo-api/pull/3816) and [this comment](https://github.com/airqo-platform/AirQo-api/pull/3816#discussion_r1833105987), we need to update the status code references in the 'insights' and 'device-monitoring' modules to use \`AirQoRequests.Status\` instead of direct \`Status\` references.\n\nThis issue is to track this work.\n\nRequested by @NicholasTurner23." \
  --assignee NicholasTurner23

Length of output: 636

src/analytics/api/utils/data_formatters.py (outdated review thread, resolved)
Comment on lines +364 to +365
devices = filter_non_private_devices(devices=json_data.get("devices", {})).get(
"devices", []
coderabbitai bot (Contributor) commented:

🛠️ Refactor suggestion

Standardize response structure across endpoints.

The response structure here uses a success parameter, while other endpoints don't. Consider standardizing the response structure across all endpoints.

-        return AirQoRequests.create_response(
-            message="exceedance data successfully fetched",
-            data=[
-                {
-                    "device_id": device_id,
-                    "total": sum(exceedances.values()),
-                    "exceedances": exceedances,
-                }
-                for device_id, exceedances in device_exceedances.items()
-            ],
-            success=AirQoRequests.Status.HTTP_200_OK,
-        )
+        return (
+            AirQoRequests.create_response(
+                "exceedance data successfully fetched",
+                data=[
+                    {
+                        "device_id": device_id,
+                        "total": sum(exceedances.values()),
+                        "exceedances": exceedances,
+                    }
+                    for device_id, exceedances in device_exceedances.items()
+                ],
+            ),
+            AirQoRequests.Status.HTTP_200_OK,
+        )

Also applies to: 378-388

Comment on lines +168 to +170
except urllib3.exceptions.HTTPError as ex:
logger.exception(f"HTTPError: {ex}")
return self.create_response(f"HTTPError: {ex}", success=False)
coderabbitai bot (Contributor) commented:

⚠️ Potential issue

Enhance exception handling in the request method

Currently, only urllib3.exceptions.HTTPError exceptions are caught. Consider catching broader exceptions to handle unexpected errors gracefully.

Suggested change:

try:
    # existing request code
-except urllib3.exceptions.HTTPError as ex:
+except Exception as ex:
    logger.exception(f"An error occurred: {ex}")
    return self.create_response(f"An error occurred: {ex}", success=False)

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +152 to +154
else:
logger.exception("Method not supported")
return None
coderabbitai bot (Contributor) commented:

🛠️ Refactor suggestion

Handle unsupported HTTP methods gracefully

When an unsupported HTTP method is used, the method logs an exception and returns None. It's better to return a meaningful error response to inform the caller.

Suggested change:

else:
    logger.exception("Method not supported")
-   return None
+   return self.create_response(f"Error: Method '{method}' not supported", success=False)

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +199 to +215
def _get_valid_option(self, option, valid_options):
"""
Returns a validated option, defaulting to the first valid option if not provided or invalid.

Args:
option (str): Option provided in the request.
valid_options (list): List of valid options.

Returns:
str: A validated option from the list.
"""
return (
option.lower()
if option and option.lower() in valid_options
else valid_options[0]
)

coderabbitai bot (Contributor) commented:

🛠️ Refactor suggestion

Review defaulting behavior in _get_valid_option method

The _get_valid_option method defaults to the first valid option if an invalid or no option is provided. This might lead to unexpected behavior for users who provide incorrect input. Consider returning an error response when an invalid option is provided to enhance input validation.
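
For illustration, a minimal hedged sketch of the stricter behaviour, raising instead of silently defaulting; the function name is hypothetical.

def get_valid_option_strict(option, valid_options):
    """Return the normalized option, or raise ValueError if it is not recognised."""
    if option and option.lower() in valid_options:
        return option.lower()
    raise ValueError(
        f"Invalid option {option!r}. Valid values are: {', '.join(valid_options)}"
    )
# The caller can then translate the ValueError into a 400 response with a clear message.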

Comment on lines +107 to +109
data_type = self._get_valid_option(
json_data.get("datatype"), valid_options["data_types"]
)
coderabbitai bot (Contributor) commented on Nov 7, 2024

🛠️ Refactor suggestion

Remove unused variable data_type

The variable data_type is assigned but never used in the code. Removing it will clean up the code and prevent any confusion.

Apply this diff to remove the unused variable:

-            data_type = self._get_valid_option(
-                json_data.get("datatype"), valid_options["data_types"]
-            )
🧰 Tools
🪛 Ruff

107-107: Local variable data_type is assigned to but never used

Remove assignment to unused variable data_type

(F841)

@NicholasTurner23 (Contributor, Author) replied:

It is going to be used. This PR is a work in progress.

coderabbitai bot (Contributor) replied:

@NicholasTurner23, understood. Thank you for the clarification.


✏️ Learnings added
Learnt from: NicholasTurner23
PR: airqo-platform/AirQo-api#3816
File: src/analytics/api/views/data.py:107-109
Timestamp: 2024-11-07T17:46:20.068Z
Learning: In this project, some PRs may be works in progress, and variables assigned but not yet used may be utilized later as the PR is completed.


Clean up

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
src/analytics/api/utils/data_formatters.py (2)

Line range hint 2-20: Remove unused import Config

The static analysis tool correctly identifies that Config is imported but never used in this file. Let's keep our imports clean and remove unused ones.

from api.utils.http import AirQoRequests
-from config import Config

logger = logging.getLogger(__name__)
🧰 Tools
🪛 Ruff

18-18: config.Config imported but unused

Remove unused import: config.Config

(F401)


326-352: Fix typo in docstring

There's a small typo in the docstring where "FilterS" should be "Filters". Also, the entities parameter in the docstring should be devices to match the actual parameter name.

def filter_non_private_devices(devices: List[str]) -> Dict[str, Any]:
    """
-    FilterS out private device IDs from a provided array of device IDs.
+    Filters out private device IDs from a provided array of device IDs.

    Args:
-        entities(List[str]): List of device/site ids to filter against.
+        devices(List[str]): List of device IDs to filter against.

    Returns:
        a response dictionary object that contains a list of non-private device ids if any.
    """
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between b4c0474 and 06c7230.

📒 Files selected for processing (1)
  • src/analytics/api/utils/data_formatters.py (4 hunks)
🧰 Additional context used
🪛 Ruff
src/analytics/api/utils/data_formatters.py

18-18: config.Config imported but unused

Remove unused import: config.Config

(F401)

🔇 Additional comments (2)
src/analytics/api/utils/data_formatters.py (2)

201-202: Well-structured type hints added

The addition of type hints enhances code clarity and helps with static type checking. The use of Union type for the return value properly handles both possible return types.


297-325: Well-implemented filtering function with proper error handling

The function is well-structured with:

  • Clear docstring documentation
  • Proper empty input validation
  • Robust error handling with logging
  • Correct instantiation of AirQoRequests

@Baalmart (Contributor) left a comment

@Baalmart merged commit c186c56 into airqo-platform:staging on Nov 7, 2024
43 of 44 checks passed
@Baalmart mentioned this pull request on Nov 7, 2024