Remove database connection during query init #3128
base: master
Conversation
Check out this pull request on ReviewNB: see visual diffs and provide feedback on Jupyter Notebooks. Powered by ReviewNB.
FlowAuth Run #23998
Run Properties:
Project: FlowAuth
Branch Review: asyncio
Run status: Passed #23998
Run duration: 00m 46s
Commit: 127d2e9a69: Merge branch 'master' into asyncio
Committer: Jonathan Gray
View all properties for this run ↗︎
Test results: Failures 0 | Flaky 0 | Pending 0 | Skipped 0 | Passing 4
View all changes introduced in this branch ↗︎
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
##           master    #3128      +/-   ##
==========================================
- Coverage   92.15%   91.56%   -0.59%
==========================================
  Files         275      279       +4
  Lines       10755    10901     +146
  Branches     1297     1314      +17
==========================================
+ Hits         9911     9982      +71
- Misses        691      756      +65
- Partials      153      163      +10
☔ View full report in Codecov by Sentry.
8f0da31 to f09f328 (Compare)
732cc05 to a1ba213 (Compare)
c88e304 to f7a16e3 (Compare)
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (11)
flowmachine/flowmachine/core/flowdb_table.py (3)
5-6: Remove unnecessary blank lines

There are two consecutive blank lines between imports and the class definition. One blank line is sufficient according to PEP 8.

 from flowmachine.core.table import Table
-
-
-class FlowDBTable(Table, metaclass=ABCMeta):
+
+class FlowDBTable(Table, metaclass=ABCMeta):
14-16: Enhance error message clarity

The current error message could be more informative by showing which specific columns are invalid.

-        raise ValueError(
-            f"Columns {columns} must be a subset of {self.all_columns}"
-        )
+        invalid_columns = set(columns) - set(self.all_columns)
+        raise ValueError(
+            f"Invalid columns: {invalid_columns}. "
+            f"Available columns are: {self.all_columns}"
+        )
18-20: Add type hint and docstring to abstract property

The abstract property should include a return type hint and docstring to guide implementers.

 @property
-def all_columns(self):
+def all_columns(self) -> list[str]:
+    """
+    Get all available columns for this table.
+
+    Returns:
+        List of column names available in this table
+
+    Raises:
+        NotImplementedError: This property must be implemented by subclasses
+    """
     raise NotImplementedError

flowmachine/flowmachine/core/preflight.py (4)
20-24: Add type hints and documentation to the decorator.

The pre_flight decorator would benefit from proper documentation explaining its purpose, usage, and return type. This will improve maintainability and help other developers understand how to use it correctly.

Apply this diff to enhance the documentation:

+from typing import Callable, TypeVar, Any
+
+T = TypeVar('T', bound=Callable[..., Any])
+
-def pre_flight(method):
+def pre_flight(method: T) -> T:
+    """Mark a method to be executed during pre-flight checks.
+
+    This decorator adds the method to the pre-flight hooks that will be
+    executed before query execution to validate its runnability.
+
+    Args:
+        method: The method to be marked for pre-flight checks
+
+    Returns:
+        The decorated method
+    """
     method.__hooks__ = getattr(method, "__hooks__", {})
     method.__hooks__["pre_flight"] = method
     return method
57-61: Simplify dictionary iteration.

The implementation is robust, but we can simplify the dictionary iteration.

Apply this diff to improve readability:

-    for key in hook_config.keys():
+    for key in hook_config:
         hooks[key].append(attr_name)

🧰 Tools
🪛 Ruff
57-57: Use `key in dict` instead of `key in dict.keys()`. Remove `.keys()` (SIM118)
31-34: Add type hints to internal variables.

Consider adding type hints to internal variables for better code clarity and type safety.

Apply this diff to enhance type safety:

-    mro = inspect.getmro(cls)
-    hooks = defaultdict(list)
+    mro: typing.Tuple[type, ...] = inspect.getmro(cls)
+    hooks: typing.DefaultDict[str, list[str]] = defaultdict(list)
65-66: Add class documentation and type hints.

The Preflight class and its method would benefit from proper documentation and return type annotation.

Apply this diff to improve documentation:

 class Preflight:
+    """Base class for implementing pre-flight checks in queries.
+
+    This class provides the infrastructure for running pre-flight validations
+    before query execution, ensuring that queries are runnable and valid.
+    """
+
-    def preflight(self):
+    def preflight(self) -> None:
+        """Execute pre-flight checks for this query and its dependencies.
+
+        Raises:
+            PreFlightFailedException: If any pre-flight checks fail
+        """

flowmachine/flowmachine/core/infrastructure_table.py (1)
9-27: Consider adding column format documentation.

The date_of_first_service and date_of_last_service columns would benefit from documentation specifying their expected date format and timezone handling.

flowmachine/flowmachine/features/utilities/events_tables_union.py (2)
101-118: Consider improving the default tables configuration.

The hardcoded default tables tuple and the accompanying comment suggest this is a temporary solution. Consider:

- Moving the default tables to a configuration file (see the sketch below)
- Adding a method to dynamically fetch available tables

Would you like me to help implement a configuration-based solution for the default tables?
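A minimal sketch of what a configuration-based approach might look like; the file name, JSON key, and fallback tuple here are hypothetical, not part of this PR:

import json
from pathlib import Path
from typing import Tuple

# Assumed fallback used when no configuration file is present
DEFAULT_TABLES: Tuple[str, ...] = ("events.calls", "events.sms")

def load_default_tables(config_path: str = "flowmachine_tables.json") -> Tuple[str, ...]:
    """Read the default event tables from a JSON config file,
    falling back to the built-in tuple when the file is absent."""
    path = Path(config_path)
    if path.exists():
        with path.open() as f:
            return tuple(json.load(f)["default_tables"])
    return DEFAULT_TABLES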
121-145: Enhance error handling and logging in _make_table_list.

While the error handling is comprehensive, consider these improvements:

- Add debug logging for successful table subset creation
- Include the actual date range in the warning message

 def _make_table_list(
     *, tables, start, stop, columns, hours, subscriber_subset, subscriber_identifier
 ):
     """
     Makes a list of EventTableSubset queries.
     """
     date_subsets = []
     for table in tables:
         try:
             sql = EventTableSubset(
                 start=start,
                 stop=stop,
                 table=table,
                 columns=columns,
                 hours=hours,
                 subscriber_subset=subscriber_subset,
                 subscriber_identifier=subscriber_identifier,
             )
             date_subsets.append(sql)
+            logger.debug(f"Successfully created subset for table {table}")
         except MissingDateError:
-            warnings.warn(f"No data in {table} for {start}–{stop}", stacklevel=2)
+            warnings.warn(
+                f"No data in {table} between {start.isoformat()} and {stop.isoformat()}",
+                stacklevel=2
+            )
     if not date_subsets:
         raise MissingDateError(start, stop)
     return date_subsets

flowmachine/flowmachine/core/cache.py (1)
490-501: Add docstring to document the protected classes function.

While the implementation with @lru_cache is excellent, the function would benefit from documentation explaining:

- The purpose of protected classes
- The source and significance of the returned class names
- The caching behaviour

Add this docstring:

 @lru_cache(maxsize=1)
 def _get_protected_classes():
+    """
+    Get a list of class names that should be protected from cache operations.
+
+    Returns
+    -------
+    list[str]
+        List containing 'Table', 'GeoTable', and class names from events_table_map
+        and infrastructure_table_map.
+
+    Note
+    ----
+    Results are cached using lru_cache to avoid repeated computation.
+    """
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (7)
- flowmachine/flowmachine/core/cache.py (10 hunks)
- flowmachine/flowmachine/core/errors/flowmachine_errors.py (1 hunks)
- flowmachine/flowmachine/core/events_table.py (1 hunks)
- flowmachine/flowmachine/core/flowdb_table.py (1 hunks)
- flowmachine/flowmachine/core/infrastructure_table.py (1 hunks)
- flowmachine/flowmachine/core/preflight.py (1 hunks)
- flowmachine/flowmachine/features/utilities/events_tables_union.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- flowmachine/flowmachine/core/errors/flowmachine_errors.py
🧰 Additional context used
🪛 Ruff
flowmachine/flowmachine/core/preflight.py
57-57: Use `key in dict` instead of `key in dict.keys()`. Remove `.keys()` (SIM118)
🔇 Additional comments (11)
flowmachine/flowmachine/core/events_table.py (2)

1-7: LGTM! Clean base class implementation.

The EventsTable base class follows good OOP principles with proper inheritance and parameter handling.

114-120: LGTM! Clean table mapping implementation.

The events_table_map provides a clean and maintainable way to look up table classes by name.

flowmachine/flowmachine/core/infrastructure_table.py (3)

4-7: Well-structured base class implementation!

The InfrastructureTable class provides a clean foundation with proper use of keyword-only arguments and inheritance.

120-120: Clean table mapping implementation!

The infrastructure_table_map provides a clear and efficient way to look up table classes by name.

29-57: Verify technical parameter constraints.

Technical parameters like height, azimuth, electrical_tilt, and mechanical_downtilt should have valid ranges. Consider implementing pre-flight checks to validate these values, along the lines of the sketch below.
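A minimal sketch of such a pre-flight range check, using the pre_flight decorator and InfrastructureTable base class introduced in this PR. The bounds and the fetch_min_max helper are illustrative assumptions, not values or APIs from the codebase:

from flowmachine.core.infrastructure_table import InfrastructureTable
from flowmachine.core.preflight import pre_flight

# Assumed bounds for illustration only; real limits belong with the schema docs
VALID_RANGES = {
    "height": (0.0, 200.0),            # metres
    "azimuth": (0.0, 360.0),           # degrees
    "electrical_tilt": (-10.0, 10.0),
    "mechanical_downtilt": (-10.0, 10.0),
}

class RangeCheckedCellsTable(InfrastructureTable):
    def __init__(self, *, columns=None):
        super().__init__(name="cells", columns=columns)

    @pre_flight
    def check_parameter_ranges(self):
        # Validate only the technical columns this query actually selects
        for column, (low, high) in VALID_RANGES.items():
            if column in (self.columns or []):
                # fetch_min_max is a hypothetical helper that would query
                # the database for the column's observed extremes
                observed_min, observed_max = self.fetch_min_max(column)
                if observed_min < low or observed_max > high:
                    raise ValueError(
                        f"{column} has values outside the expected range [{low}, {high}]"
                    )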
flowmachine/flowmachine/features/utilities/events_tables_union.py (3)

46-47: Well-structured type hints for date parameters!

The addition of proper type hints using Union[str, datetime.date, datetime.datetime] improves code clarity and IDE support whilst maintaining flexibility in input formats.

65-65: Improved error handling for column selection!

The error message clearly explains why using "*" with multiple tables is not allowed, helping developers quickly understand and fix the issue.

Also applies to: 67-68

70-74: Well-structured function call with explicit parameters!

The explicit parameter passing improves code readability and maintainability.

flowmachine/flowmachine/core/cache.py (3)

195-198: Excellent enhancement to error handling and validation!

The addition of preflight checks and consistent error handling through the state machine significantly improves the robustness of the cache writing process. Each stage (preflight, SQL generation, execution, and metadata writing) now properly updates the query state on failure.

Also applies to: 210-212, 226-228

308-308: Good addition of structured logging!

The debug logging for cache touches improves observability by including both the query_id and query details.

532-532: Well-structured SQL query modifications!

The use of ARRAY{_get_protected_classes()} in the SQL queries provides a consistent and maintainable approach to filtering protected classes. The implementation properly leverages PostgreSQL's array operators.

Also applies to: 712-714
"outgoing", | ||
"datetime", | ||
"duration", | ||
"network", | ||
"msisdn", | ||
"msisdn_counterpart", | ||
"location_id", | ||
"imsi", | ||
"imei", | ||
"tac", | ||
"operator_code", | ||
"country_code", | ||
] | ||
|
||
def __init__(self, *, columns=None): | ||
super().__init__(name="calls", columns=columns) | ||
|
||
|
||
class ForwardsTable(EventsTable): | ||
all_columns = [ | ||
"id", | ||
"outgoing", | ||
"datetime", | ||
"network", | ||
"msisdn", | ||
"msisdn_counterpart", | ||
"location_id", | ||
"imsi", | ||
"imei", | ||
"tac", | ||
"operator_code", | ||
"country_code", | ||
] | ||
|
||
def __init__(self, *, columns=None): | ||
super().__init__(name="forwards", columns=columns) | ||
|
||
|
||
class SmsTable(EventsTable): | ||
all_columns = [ | ||
"id", | ||
"outgoing", | ||
"datetime", | ||
"network", | ||
"msisdn", | ||
"msisdn_counterpart", | ||
"location_id", | ||
"imsi", | ||
"imei", | ||
"tac", | ||
"operator_code", | ||
"country_code", | ||
] | ||
|
||
def __init__(self, *, columns): | ||
super().__init__(name="sms", columns=columns) | ||
|
||
|
||
class MdsTable(EventsTable): | ||
all_columns = [ | ||
"id", | ||
"datetime", | ||
"duration", | ||
"volume_total", | ||
"volume_upload", | ||
"volume_download", | ||
"msisdn", | ||
"location_id", | ||
"imsi", | ||
"imei", | ||
"tac", | ||
"operator_code", | ||
"country_code", | ||
] | ||
|
||
def __init__(self, *, columns): | ||
super().__init__(name="mds", columns=columns) | ||
|
||
|
||
class TopupsTable(EventsTable): | ||
all_columns = [ | ||
"id", | ||
"datetime", | ||
"type", | ||
"recharge_amount", | ||
"airtime_fee", | ||
"tax_and_fee", | ||
"pre_event_balance", | ||
"post_event_balance", | ||
"msisdn", | ||
"location_id", | ||
"imsi", | ||
"imei", | ||
"tac", | ||
"operator_code", | ||
"country_code", | ||
] | ||
|
||
def __init__(self, *, columns): | ||
super().__init__(name="topups", columns=columns) |
🛠️ Refactor suggestion
Consider using composition to manage column definitions.
The current implementation has significant duplication in column definitions. Consider restructuring using composition to improve maintainability:
class ColumnSets:
COMMON = [
"id",
"datetime",
"location_id",
"msisdn",
]
DEVICE = [
"imsi",
"imei",
"tac",
]
NETWORK = [
"operator_code",
"country_code",
]
INTERACTION = [
"msisdn_counterpart",
"network",
"outgoing",
]
class CallsTable(EventsTable):
all_columns = [
*ColumnSets.COMMON,
*ColumnSets.DEVICE,
*ColumnSets.NETWORK,
*ColumnSets.INTERACTION,
"duration",
]
This approach would:
- Make column groups more maintainable
- Reduce duplication
- Make it easier to ensure consistency across tables
- Facilitate adding new tables with similar column patterns
    try:
        getattr(dependency, hook)()
    except Exception as e:
        errors.setdefault(str(dependency), list()).append(e)
    if len(errors) > 0:
Improve exception handling and type safety.
The current broad exception handling could mask important errors. Additionally, the error accumulation could benefit from type hints.
Apply this diff to enhance error handling:
+ ErrorDict = typing.Dict[str, typing.List[Exception]]
+
+ def preflight(self) -> None:
- errors = dict()
+ errors: self.ErrorDict = dict()
# ... existing code ...
try:
getattr(dependency, hook)()
- except Exception as e:
+ except (ValueError, TypeError, AttributeError) as e:
errors.setdefault(str(dependency), list()).append(e)
+ except Exception as e:
+ logger.error(
+ "Unexpected error during pre-flight",
+ error=str(e),
+ dependency=str(dependency),
+ hook=hook
+ )
+ raise
Committable suggestion was skipped due to low confidence.
class TacsTable(InfrastructureTable):
    all_columns = [
        "id",
        "brand",
        "model",
        "width",
        "height",
        "depth",
        "weight",
        "display_type",
        "display_colors",
        "display_width",
        "display_height",
        "mms_receiver",
        "mms_built_in_camera",
        "wap_push_ota_support",
        "hardware_gprs",
        "hardware_edge",
        "hardware_umts",
        "hardware_wifi",
        "hardware_bluetooth",
        "hardware_gps",
        "software_os_vendor",
        "software_os_name",
        "software_os_version",
        "wap_push_ota_settings",
        "wap_push_ota_bookmarks",
        "wap_push_ota_app_internet",
        "wap_push_ota_app_browser",
        "wap_push_ota_app_mms",
        "wap_push_ota_single_shot",
        "wap_push_ota_multi_shot",
        "wap_push_oma_settings",
        "wap_push_oma_app_internet",
        "wap_push_oma_app_browser",
        "wap_push_oma_app_mms",
        "wap_push_oma_cp_bookmarks",
        "wap_1_2_1",
        "wap_2_0",
        "syncml_dm_settings",
        "syncml_dm_acc_gprs",
        "syncml_dm_app_internet",
        "syncml_dm_app_browser",
        "syncml_dm_app_mms",
        "syncml_dm_app_bookmark",
        "syncml_dm_app_java",
        "wap_push_oma_app_ims",
        "wap_push_oma_app_poc",
        "j2me_midp_10",
        "j2me_midp_20",
        "j2me_midp_21",
        "j2me_cldc_10",
        "j2me_cldc_11",
        "j2me_cldc_20",
        "hnd_type",
    ]

    def __init__(self, *, columns=None):
        super().__init__(name="tacs", columns=columns)
🛠️ Refactor suggestion
Consider grouping related columns for improved maintainability.
The extensive list of columns could be organised into logical groups using class attributes. This would improve readability and make maintenance easier.
Consider restructuring like this:
class TacsTable(InfrastructureTable):
DEVICE_SPECS = [
"id", "brand", "model",
"width", "height", "depth", "weight"
]
DISPLAY_SPECS = [
"display_type", "display_colors",
"display_width", "display_height"
]
HARDWARE_CAPABILITIES = [
"hardware_gprs", "hardware_edge", "hardware_umts",
"hardware_wifi", "hardware_bluetooth", "hardware_gps"
]
SOFTWARE_INFO = [
"software_os_vendor", "software_os_name",
"software_os_version"
]
WAP_CAPABILITIES = [
"wap_1_2_1", "wap_2_0",
# ... other WAP-related columns
]
all_columns = (DEVICE_SPECS + DISPLAY_SPECS +
HARDWARE_CAPABILITIES + SOFTWARE_INFO +
WAP_CAPABILITIES)
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (9)
flowmachine/tests/test_union_with_fixed_values.py (1)
Line range hint 82-94: Consider adding explicit column specifications to error test cases.

For consistency, consider adding explicit column specifications to the Table constructors in the remaining error test cases:

-    [Table("events.calls_20160101"), Table("events.calls_20160101")],
+    [Table("events.calls_20160101", columns=["msisdn"]), Table("events.calls_20160101", columns=["msisdn"])],
-    [Table("events.calls_20160101"), Table("events.calls_20160101")],
+    [Table("events.calls_20160101", columns=["msisdn"]), Table("events.calls_20160101", columns=["msisdn"])],

flowmachine/flowmachine/core/preflight.py (3)
20-24: Add type hints and docstring to the decorator.

The decorator would benefit from:

- Type hints for better code clarity
- A docstring explaining its purpose and usage

Consider this enhancement:

-def pre_flight(method):
+def pre_flight(method: typing.Callable) -> typing.Callable:
+    """Mark a method to be executed during pre-flight checks.
+
+    Args:
+        method: The method to be marked for pre-flight execution
+
+    Returns:
+        The decorated method
+    """
     method.__hooks__ = getattr(method, "__hooks__", {})
     method.__hooks__["pre_flight"] = method
     return method
26-63: LGTM! Consider enhancing type hints for better clarity.

The function effectively resolves hooks through the class hierarchy with robust error handling. Consider adding more specific type hints:

-def resolve_hooks(cls) -> typing.Dict[str, typing.List[typing.Callable]]:
+def resolve_hooks(
+    cls: type
+) -> typing.Dict[str, typing.List[typing.Callable[..., None]]]:
65-96: Add class documentation and type hints.

The Preflight class would benefit from:

- A class docstring explaining its purpose and usage
- Type hints for the preflight method

Consider these enhancements:

 class Preflight:
+    """Base class for implementing pre-flight checks.
+
+    This class provides functionality to verify query validity before execution
+    by running pre-flight checks on the query and its dependencies.
+    """
-    def preflight(self):
+    def preflight(self) -> None:
+        """Execute pre-flight checks for this query and its dependencies.
+
+        Raises:
+            PreFlightFailedException: If any pre-flight checks fail
+        """

flowmachine/tests/test_cache.py (1)
42-42: LGTM! Consider consolidating preflight checks.

The addition of preflight() calls before cache operations is correct and aligns with the PR objectives. These checks ensure query validity before caching.

Consider extracting this common pattern into a helper method to reduce duplication:

def cache_query(query: Query) -> None:
    query.preflight()
    with get_db().engine.begin() as trans:
        write_cache_metadata(trans, query)

Also applies to: 55-55, 70-70
flowmachine/flowmachine/core/server/action_handlers.py (1)
Line range hint 89-117: Add exception chaining to preserve error context.

The exception handling blocks should preserve the original exception context using the `from` keyword.

Apply this diff to fix the exception handling:

 except PreFlightFailedException as exc:
     orig_error_msg = exc.args[0]
     error_msg = (
         f"Internal flowmachine server error: could not create query object using query schema. "
         f"The original error was: '{orig_error_msg}'"
     )
     raise QueryLoadError(
         error_msg, params, orig_error_msg=orig_error_msg,
-    )
+    ) from exc
 except TypeError as exc:
     orig_error_msg = exc.args[0]
     error_msg = (
         f"Internal flowmachine server error: could not create query object using query schema. "
         f"The original error was: '{orig_error_msg}'"
     )
     raise QueryLoadError(
         error_msg, params, orig_error_msg=orig_error_msg,
-    )
+    ) from exc

🧰 Tools
🪛 Ruff
85-85: Undefined name BaseExposedQuery (F821)
95-99: Within an except clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
flowmachine/tests/test_query_object_construction.test_construct_query.approved.txt (2)
Line range hint 71-163: Consider validating polygon coordinates at runtime.

The meaningful locations queries now include detailed polygon coordinates. While the structure is well-defined, consider adding runtime validation to ensure:

- Coordinates form valid polygons
- Coordinates are within expected bounds
- No self-intersecting polygons

Consider adding a validation helper:

def validate_polygon_coordinates(coordinates):
    """
    Validates polygon coordinates for meaningful locations.

    Args:
        coordinates: List of coordinate pairs

    Raises:
        ValueError: If coordinates are invalid
    """
    if not coordinates or not coordinates[0]:
        raise ValueError("Empty coordinates")
    # Ensure polygon is closed
    if coordinates[0][0] != coordinates[-1][0]:
        raise ValueError("Polygon must be closed")
    # Add more validation as needed
Line range hint 35-44: Document the rationale for nullable event types.

Setting event_types to null broadens the query scope. Ensure this change is intentional and document the expected behaviour when event types are null.

Consider adding a comment explaining the null event types behaviour:
# When event_types is null, the query includes all available event types
flowmachine/flowmachine/core/query.py (1)
62-62: Well-structured inheritance from Preflight class!

The inheritance from Preflight effectively supports the PR's goal of removing database connection during query initialisation, enabling pre-flight checks without requiring an active connection.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (13)
- flowmachine/flowmachine/core/flowdb_table.py (1 hunks)
- flowmachine/flowmachine/core/preflight.py (1 hunks)
- flowmachine/flowmachine/core/query.py (5 hunks)
- flowmachine/flowmachine/core/server/action_handlers.py (5 hunks)
- flowmachine/flowmachine/core/server/query_schemas/base_schema.py (2 hunks)
- flowmachine/tests/functional_tests/approved_files/test_sql_strings_and_results.test_daily_location_1_sql.approved.txt (1 hunks)
- flowmachine/tests/functional_tests/approved_files/test_sql_strings_and_results.test_daily_location_2_sql.approved.txt (1 hunks)
- flowmachine/tests/functional_tests/approved_files/test_sql_strings_and_results.test_daily_location_4_sql.approved.txt (1 hunks)
- flowmachine/tests/functional_tests/approved_files/test_sql_strings_and_results.test_daily_location_6_sql.approved.txt (1 hunks)
- flowmachine/tests/test_cache.py (5 hunks)
- flowmachine/tests/test_query_object_construction.test_construct_query.approved.txt (11 hunks)
- flowmachine/tests/test_redacted_total_events.py (3 hunks)
- flowmachine/tests/test_union_with_fixed_values.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- flowmachine/flowmachine/core/flowdb_table.py
- flowmachine/flowmachine/core/server/query_schemas/base_schema.py
- flowmachine/tests/functional_tests/approved_files/test_sql_strings_and_results.test_daily_location_4_sql.approved.txt
- flowmachine/tests/test_redacted_total_events.py
🧰 Additional context used
🪛 Ruff
flowmachine/flowmachine/core/server/action_handlers.py
95-99: Within an except clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
113-117: Within an except clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
🔇 Additional comments (16)
flowmachine/tests/functional_tests/approved_files/test_sql_strings_and_results.test_daily_location_1_sql.approved.txt (1)
30-34: Excellent optimisation of the geography.admin3 subquery!

The reduction to only essential columns (admin3pcod, admin3name, geom) is a positive change that should improve query performance by reducing unnecessary data transfer whilst maintaining all required functionality.

flowmachine/tests/test_union_with_fixed_values.py (3)

16-19: LGTM! Explicit column specifications improve test clarity.

The addition of explicit column specifications makes the test's requirements clearer and aligns with the broader goal of reducing database dependency during object instantiation.

Also applies to: 24-27

34-34: LGTM! Column specification matches test usage.

The explicit column specification correctly identifies the "id" column that's being used in the test assertions.

46-49: LGTM! Consistent column specifications.

The explicit column specifications are consistent across both tables and maintain the test's original behaviour.

flowmachine/tests/functional_tests/approved_files/test_sql_strings_and_results.test_daily_location_6_sql.approved.txt (1)

31-34: Previous review comment about unused admin3name column is still applicable.

The selection of admin3name in the subquery remains unnecessary as it's not used in the final results.

flowmachine/flowmachine/core/preflight.py (1)

1-18: LGTM! Well-structured imports and logger setup.

The imports are logically grouped and the logger configuration follows best practices by including the submodule name.

flowmachine/tests/functional_tests/approved_files/test_sql_strings_and_results.test_daily_location_2_sql.approved.txt (1)

48-49: Verify spatial index usage for optimal performance.

The spatial join using st_within and st_setsrid is computationally expensive. Ensure that a spatial index exists on the geography.admin2.geom column for optimal query performance; a sketch of one way to check this follows.
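A minimal sketch of how the presence of such an index could be checked from flowmachine, assuming the get_db().fetch helper used elsewhere in this PR (the import path and index name are assumptions):

from flowmachine.core.context import get_db  # import path assumed

def has_spatial_index(schema: str = "geography", table: str = "admin2") -> bool:
    """Check pg_indexes for a GiST index covering the table's geom column."""
    rows = get_db().fetch(
        # NB: f-string interpolation is safe here only for trusted constants
        f"""SELECT indexdef FROM pg_indexes
            WHERE schemaname = '{schema}' AND tablename = '{table}'"""
    )
    return any("gist" in row[0].lower() and "geom" in row[0] for row in rows)

# If it is missing, a DBA could add one with, for example:
# CREATE INDEX admin2_geom_gist ON geography.admin2 USING gist (geom);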
316-316
: LGTM! Improved query composition.The change to use
self.nested.get_query()
instead of a static string is a good improvement. It properly delegates query generation to the nested query object, making the code more maintainable and flexible.
150-152
: Please clarify the comment about Table object dependencies.The assertion
assert not has_deps
suggests we expect no dependencies, but the comment implies there should be remaining dependencies from Table objects. This appears contradictory.Let's verify the expected behaviour:
✅ Verification successful
The previous command failed because the path was incorrect. Let's try with the correct path structure as seen in the review comment. This will help us understand:
- The context around the assertion
- Other similar assertions in the test suite
- The overall test case structure and setup
After reviewing the full test file, I can now provide a clear response about the assertion and comment.
The comment about Table dependencies is accurate and not contradictory
The assertion
assert not has_deps
intest_invalidate_cache_multi
is correct. Here's why:
- The test verifies that invalidating a cached query removes both the query and its dependencies from the cache.dependencies table
- The comment "the remaining dependencies are due to underlying Table objects" is explaining that whilst there might be Table objects with dependencies elsewhere in the system, they aren't tracked in the cache.dependencies table
- This is confirmed by the subsequent test
test_invalidate_cache_midchain
which explicitly shows that Table dependencies are handled differently, where it assertshas_deps
is True and comments "Daily location deps should remain"The assertion and comment are working together to document the expected behaviour: the cache.dependencies table should be empty (hence
assert not has_deps
), even though Table object dependencies continue to exist elsewhere in the system.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for similar assertions in other test files rg -B 5 -A 5 "has_deps.*assert" tests/Length of output: 87
Script:
#!/bin/bash # Search for similar assertions in the correct path rg -B 5 -A 5 "has_deps.*assert" flowmachine/tests/ # Let's also look at the full test file to understand the context cat flowmachine/tests/test_cache.pyLength of output: 11984
flowmachine/flowmachine/core/server/action_handlers.py (2)
17-17: LGTM! Well-structured imports for enhanced error handling and logging.

The addition of structlog and PreFlightFailedException imports aligns well with the PR's objective of improving query initialization and error handling.

Also applies to: 45-47

223-240: ⚠️ Potential issue: Use appropriate logging level for exceptions.

Using logger.debug for exceptions might cause important error information to be missed in production. Consider using logger.error or logger.exception for better visibility of issues.

Apply this diff to improve the logging:

 except Exception as exc:
-    logger.debug(str(exc), exception=exc, traceback=traceback.format_exc())
+    logger.exception("Failed to create query object", exc_info=exc)
     return ZMQReply(
         status="error",
         msg="Unable to create query object.",
         payload={"exception": str(exc)},
     )

Likely invalid or redundant comment.
flowmachine/tests/test_query_object_construction.test_construct_query.approved.txt (3)
Line range hint 1-892: LGTM: Configuration consistency is well maintained.

The query configurations demonstrate good consistency across:

- Aggregation units
- Weight thresholds
- Date formats
- Parameter naming

Line range hint 358-373: Verify the impact of join type on query results.

The join type "left outer" might exclude relevant data points. Consider documenting the rationale for using left outer join instead of full outer join in this context.

✅ Verification successful

Let me gather more context about the flows implementation to understand the join type usage better.

Let me check one more thing to understand the flows implementation better.

The "left outer" join type is appropriate for this flow query

Based on the codebase analysis:

- The Flows class explicitly supports multiple join types, including "left outer" and "full outer" joins
- The test cases demonstrate that "left outer" join is intentionally used to retain all records from the first location query while allowing null values in the destination locations
- The test test_flows_outer_join specifically validates this behaviour, ensuring that pcod_to can be null whilst pcod_from is always populated

The use of "left outer" join in this context is a valid design choice, as it preserves all source locations whilst allowing for cases where subscribers may not have a corresponding destination location.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for consistency in join types across flow queries
# Search for join type definitions in flow queries
rg -t py "join_type.*(?:left|right|full|inner).*outer"

Length of output: 1079

Script:

#!/bin/bash
# Check the implementation of flows to understand join type usage
ast-grep --pattern 'class Flows { $$$ }'

# Also check for any documentation or comments about join types
rg -B 2 -A 2 "join_type.*(?:left|right|full|inner).*outer" flowmachine/flowmachine/features/location/flows.py

Length of output: 494

Script:

#!/bin/bash
# Check the flows class implementation and docstrings
rg -B 5 -A 10 "class Flows" flowmachine/flowmachine/features/location/flows.py

# Also check the test file to understand the test cases
rg -B 5 -A 10 "def test.*flows" flowmachine/tests/test_flows.py

Length of output: 4453

2-2: Verify the impact of query identifier changes across the system.

All query identifiers have been updated, which might affect caching and query result retrieval. Ensure that the system properly handles these new identifiers, especially for cached results.

Also applies to: 23-23, 35-35, 45-45, 67-67, 71-71, 164-164, 259-259, 358-358, 374-374, 421-421, 643-643
flowmachine/flowmachine/core/query.py (2)
390-397: Robust implementation of get_table method!

The addition of the preflight() check and improved error handling enhances the reliability of table creation. The clear error message for non-stored queries improves debugging experience.

373-374: Verify input sanitisation for SQL query construction.

While the f-string usage improves readability, ensure that n is properly validated to prevent potential SQL injection attacks. Consider adding input validation for the n parameter, as in the sketch below.
[
    Table("events.calls_20160101", columns=["msisdn"]),
    Table("events.calls_20160101", columns=["msisdn"]),
],
💡 Codebase verification
Missing column specifications in test_union_raises_with_mismatched_extras_types and test_union_raises_with_not_enough_extras functions
The review comment's verification revealed inconsistencies in the codebase. While some Table instances have been updated with explicit column specifications, there are two test functions where the Table instantiations still lack the column parameter:
- In test_union_raises_with_mismatched_extras_types: `[Table("events.calls_20160101"), Table("events.calls_20160101")]`
- In test_union_raises_with_not_enough_extras: `[Table("events.calls_20160101"), Table("events.calls_20160101")]`
These instances should be updated with explicit column specifications to maintain consistency with the rest of the codebase.
🔗 Analysis chain
LGTM! Column specifications added while preserving type testing.
The explicit column specifications have been added consistently, maintaining the test's focus on datetime type handling.
Let's verify that all Table instances in the file have been updated with explicit columns:
Let me verify the entire file content to understand the context better before making a conclusion.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any Table instantiations without explicit column specifications
# Expect: No matches as all should now specify columns
rg "Table\([^)]+\)" flowmachine/tests/test_union_with_fixed_values.py | rg -v "columns="
Length of output: 244
Script:
#!/bin/bash
# Get the full content of the test file
cat flowmachine/tests/test_union_with_fixed_values.py
Length of output: 3110
INNER JOIN (SELECT admin2pcod,
                   admin2name,
                   admin2pcod,
                   admin2refn,
                   admin2altn,
                   admin2al_1,
                   date,
                   validon,
                   validto,
                   shape_star,
                   shape_stle,
                   shape_leng,
                   shape_area,
                   geom
            FROM geography.admin2) AS geom_table ON st_within(CAST(loc_table.geom_point AS geometry),
🛠️ Refactor suggestion
Consider removing unused admin2name column.
The reduction of columns from the geography.admin2 table improves query performance. However, the admin2name column doesn't appear to be used in the final output or any join conditions. Consider removing it unless it's needed for debugging or logging purposes.
INNER JOIN (SELECT admin2pcod,
- admin2name,
geom
FROM geography.admin2) AS geom_table
Committable suggestion was skipped due to low confidence.
@@ -80,6 +85,18 @@ async def action_handler__get_query_schemas(
 def _load_query_object(params: dict) -> "BaseExposedQuery":
     try:
         query_obj = FlowmachineQuerySchema().load(params)
+        query_obj._flowmachine_query_obj.preflight()  # Note that we probably want to remove this call to allow getting qid faster
🛠️ Refactor suggestion
Consider deferring the preflight check.
The comment suggests that this preflight check might be unnecessary at this point and could slow down query ID retrieval. Consider moving this check to a later stage in the query lifecycle.
The order that the preflight errors are in is nondeterministic for dependencies of the same level in the tree
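A minimal sketch of one way to make the ordering deterministic: sort the accumulated errors by dependency key before raising. The errors dict shape matches the preflight loop quoted earlier; the function name is illustrative:

def sort_preflight_errors(errors: dict) -> dict:
    """Order the str(dependency) -> [exceptions] mapping by dependency name,
    so same-level dependencies always report in a stable order."""
    return dict(sorted(errors.items(), key=lambda item: item[0]))

# e.g. raise PreFlightFailedException(query_id, sort_preflight_errors(errors))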
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- flowmachine/flowmachine/core/server/action_handlers.py (4 hunks)
- flowmachine/tests/server/test_action_handlers.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- flowmachine/tests/server/test_action_handlers.py
🧰 Additional context used
🪛 Ruff
flowmachine/flowmachine/core/server/action_handlers.py
95-97: Within an except clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
🔇 Additional comments (3)
flowmachine/flowmachine/core/server/action_handlers.py (3)
17-17: LGTM! Well-structured logging setup.

The addition of structlog and the logger configuration with submodule name will improve debugging capabilities.

Also applies to: 45-48

229-233: ⚠️ Potential issue: Use appropriate log level for exceptions

Using logger.debug might not adequately capture critical exceptions. Consider using logger.error or logger.exception.

Apply this diff to improve the logging:

-    logger.debug(str(exc), exception=exc, traceback=traceback.format_exc())
+    logger.exception("Failed to create query object", exc_info=exc)

Likely invalid or redundant comment.

88-88: 🛠️ Refactor suggestion: Consider deferring the preflight check

The comment suggests that this preflight check might be unnecessary at this point. Consider moving it to a later stage in the query lifecycle, perhaps during the actual query execution in store_async.

Let's verify the impact of this preflight check:
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 7
🧹 Outside diff range and nitpick comments (4)
flowmachine/flowmachine/core/events_table.py (1)
8-124: Enhance code documentation and type safety.

Consider the following improvements:

- Add docstrings to describe the purpose and usage of each class
- Add type hints for better IDE support and code clarity
- Consider adding runtime validation for column names against the database schema

Example enhancement:

from typing import Optional, List, Dict, Type

class EventsTable(FlowDBTable):
    """Base class for all event tables in the FlowDB.

    Provides common functionality for accessing event data
    with specific schema and columns.
    """

    def __init__(self, *, name: str, columns: Optional[List[str]]) -> None:
        super().__init__(schema="events", name=name, columns=columns)

Would you like me to generate the complete set of type hints and docstrings for all classes?
flowmachine/flowmachine/core/infrastructure_table.py (3)
8-10: Consider adding input parameter validation.

The constructor accepts name and columns without validation. Consider adding checks to ensure:

- name is a non-empty string
- columns is either None or a non-empty list of strings

 def __init__(self, *, name, columns):
+    if not isinstance(name, str) or not name.strip():
+        raise ValueError("name must be a non-empty string")
+    if columns is not None and (not isinstance(columns, list) or not all(isinstance(col, str) for col in columns)):
+        raise ValueError("columns must be None or a list of strings")
     super().__init__(schema="infrastructure", name=name, columns=columns)

13-31: Add docstring to describe the table's purpose and column definitions.

Consider adding class-level documentation to explain:

- The purpose of the sites table
- The meaning and expected format of each column
- The relationship between site_id and id
- The geometry column formats

class SitesTable(InfrastructureTable):
    """Represents infrastructure sites with their locations and properties.

    Columns:
        site_id (str): Unique identifier for the site
        id (str): Internal identifier
        version (str): Version control identifier
        name (str): Human-readable site name
        type (str): Type of site
        status (str): Current operational status
        structure_type (str): Physical structure classification
        is_cow (bool): Whether this is a Cell-on-Wheels
        date_of_first_service (date): Service start date
        date_of_last_service (date): Service end date
        geom_point (geometry): Point geometry of site location
        geom_polygon (geometry): Polygon geometry of site boundary
    """

124-124: Add type hints to improve code clarity.

Consider adding type hints to make the mapping's types explicit:

-infrastructure_table_map = dict(tacs=TacsTable, cells=CellsTable, sites=SitesTable)
+infrastructure_table_map: dict[str, type[InfrastructureTable]] = {
+    "tacs": TacsTable,
+    "cells": CellsTable,
+    "sites": SitesTable,
+}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (7)
- flowmachine/flowmachine/core/events_table.py (1 hunks)
- flowmachine/flowmachine/core/flowdb_table.py (1 hunks)
- flowmachine/flowmachine/core/infrastructure_table.py (1 hunks)
- flowmachine/flowmachine/core/server/action_handlers.py (4 hunks)
- flowmachine/flowmachine/core/table.py (4 hunks)
- flowmachine/tests/test_cache_utils.py (5 hunks)
- flowmachine/tests/test_table.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- flowmachine/flowmachine/core/flowdb_table.py
- flowmachine/tests/test_cache_utils.py
🧰 Additional context used
🪛 Ruff
flowmachine/flowmachine/core/server/action_handlers.py
95-97: Within an except clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
flowmachine/tests/test_table.py
7-7: flowmachine.core.errors.flowmachine_errors.QueryErroredException imported but unused. Remove unused import (F401)
50-50: Local variable t is assigned to but never used. Remove assignment to unused variable t (F841)
🔇 Additional comments (12)
flowmachine/flowmachine/core/events_table.py (2)
8-10: Well-structured base class implementation!

The base class is well-designed with keyword-only arguments and clear inheritance hierarchy.

118-124: Excellent table mapping implementation!

The dictionary provides an efficient way to map table names to their respective classes, facilitating dynamic instantiation.
flowmachine/tests/test_table.py (4)
19-45: Well-structured parameterised tests

Good implementation of parameterised tests with clear separation between initialisation errors and preflight failures. The test cases effectively cover various error scenarios.

Line range hint 53-69: LGTM! Clear and focused test cases

The test cases are well-documented and effectively verify the intended functionality.

🧰 Tools
🪛 Ruff
7-7: flowmachine.core.errors.flowmachine_errors.QueryErroredException imported but unused. Remove unused import (F401)
50-50: Local variable t is assigned to but never used. Remove assignment to unused variable t (F841)

Line range hint 71-106: Well-implemented storage and dependency tests

Good coverage of storage-related functionality with clear assertions and proper cleanup.

Line range hint 108-116: Thorough serialisation test

Good verification of both query content and identity preservation during serialisation.
flowmachine/flowmachine/core/infrastructure_table.py (1)
63-122: Previous suggestion about grouping columns remains applicable.

A past review comment suggested grouping the columns into logical categories for improved maintainability.

flowmachine/flowmachine/core/table.py (2)

12-17: LGTM: Well-structured imports

The new imports are logically organised and support the pre-flight functionality additions.

66-67: LGTM: Improved constructor signature

The changes enforce better parameter handling by:

- Making name a required parameter
- Using * to enforce keyword-only arguments for optional parameters
- Adding proper type hints

A sketch of the resulting signature shape follows.
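A minimal sketch of the signature shape this describes; the exact parameter list in table.py may differ, this only illustrates the keyword-only pattern:

from typing import List, Optional

class Table:
    def __init__(
        self,
        name: str,                       # required
        *,                               # everything after this is keyword-only
        schema: Optional[str] = None,
        columns: Optional[List[str]] = None,
    ) -> None:
        self.name = name
        self.schema = schema
        self.columns = columns

# Table("events.calls", columns=["msisdn"]) works;
# Table("events.calls", None, ["msisdn"]) now raises TypeError.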
flowmachine/flowmachine/core/server/action_handlers.py (3)
17-17: LGTM: Enhanced logging and error handling setup

The addition of structlog and PreFlightFailedException imports, along with the logger setup, improves the error handling and debugging capabilities.

Also applies to: 45-48

216-233: LGTM: Comprehensive error handling and logging

The error handling for PreFlightFailedException is well-structured with:

- Clear error messages
- Detailed payload information
- Appropriate error logging with traceback

88-88: 🛠️ Refactor suggestion: Consider deferring the preflight check

The preflight check at query object creation time might impact performance, especially for query ID retrieval operations that don't require immediate validation.

Let's verify the performance impact:

Consider moving this check to a later stage in the query lifecycle, such as just before query execution; a sketch follows.
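A minimal sketch of what deferring could look like, reusing the _load_query_object hunk quoted earlier; the run-time call site and the store_async call are assumptions based on the review comments, not code from this PR:

def _load_query_object(params: dict) -> "BaseExposedQuery":
    # No preflight here: construction stays cheap, so the query_id
    # can be returned without touching the database
    return FlowmachineQuerySchema().load(params)

def run_query(params: dict):  # hypothetical call site
    query_obj = _load_query_object(params)
    # Defer the check until the query is actually about to execute
    query_obj._flowmachine_query_obj.preflight()
    query_obj._flowmachine_query_obj.store_async()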
def __init__(self, *, columns=None):
    super().__init__(name="calls", columns=columns)
Standardise the constructor parameter requirements.

The columns parameter is inconsistently handled across table classes:

- Optional in CallsTable and ForwardsTable
- Required in SmsTable, MdsTable, and TopupsTable

This inconsistency could lead to confusion and maintenance issues.

Consider standardising the constructor signatures. If columns should be optional, apply this pattern:
- def __init__(self, *, columns):
+ def __init__(self, *, columns=None):
Also applies to: 50-51, 70-71, 91-92, 114-115
flowmachine/tests/test_table.py (Outdated)
"""Test that where no schema is provided, public schema is checked.""" | ||
t = Table("gambia_admin2") | ||
t = Table("gambia_admin2", columns=["geom"]).preflight() |
Remove unused variable assignment

The result of preflight() is assigned to t but never used.

Apply this diff:

-    t = Table("gambia_admin2", columns=["geom"]).preflight()
+    Table("gambia_admin2", columns=["geom"]).preflight()

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    Table("gambia_admin2", columns=["geom"]).preflight()

🧰 Tools
🪛 Ruff
50-50: Local variable t is assigned to but never used. Remove assignment to unused variable t (F841)
from flowmachine.core.errors.flowmachine_errors import (
    QueryErroredException,
    PreFlightFailedException,
)
Remove unused import QueryErroredException
The QueryErroredException is imported but not used in any of the test cases.
Apply this diff to remove the unused import:
from flowmachine.core.errors.flowmachine_errors import (
- QueryErroredException,
PreFlightFailedException,
)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
from flowmachine.core.errors.flowmachine_errors import (
    PreFlightFailedException,
)
🧰 Tools
🪛 Ruff
7-7: flowmachine.core.errors.flowmachine_errors.QueryErroredException imported but unused. Remove unused import (F401)
# Record provided columns to ensure that query_id differs with different columns
if isinstance(columns, str):  # Wrap strings in a list
    columns = [columns]
self.columns = columns
if self.columns is None or len(self.columns) == 0:
    raise ValueError("No columns requested.")
🛠️ Refactor suggestion
Consider enhancing column validation

The current implementation raises a ValueError when columns is either None or empty. However, it might be more user-friendly to:

- Use columns from the database when None is provided
- Provide a more descriptive error message specifying the table name
- if self.columns is None or len(self.columns) == 0:
- raise ValueError("No columns requested.")
+ if self.columns is None:
+ self.columns = self._get_db_columns()
+ elif len(self.columns) == 0:
+ raise ValueError(f"No columns requested for table {self.fqn}")
Committable suggestion was skipped due to low confidence.
 def check_columns(self):
     # Get actual columns of this table from the database
     db_columns = list(
         zip(
             *get_db().fetch(
                 f"""SELECT column_name from INFORMATION_SCHEMA.COLUMNS
                 WHERE table_name = '{self.name}' AND table_schema='{self.schema}'"""
             )
         )
     )[0]
-    if (
-        columns is None or columns == []
-    ):  # No columns specified, setting them from the database
-        columns = db_columns
-    else:
-        self.parent_table = Table(
-            schema=self.schema, name=self.name
-        )  # Point to the full table
-    if isinstance(columns, str):  # Wrap strings in a list
-        columns = [columns]
-    logger.debug(
-        "Checking provided columns against db columns.",
-        provided=columns,
-        db_columns=db_columns,
-    )
+    logger.debug(
+        "Checking provided columns against db columns.",
+        provided=self.columns,
+        db_columns=db_columns,
+    )
     if not set(self.columns).issubset(db_columns):
         raise ValueError(
             f"{set(self.columns).difference(db_columns)} are not columns of {self.fqn}"
         )
Fix SQL injection vulnerability
The SQL query in check_columns is vulnerable to SQL injection as it directly interpolates values into the query string. Use parameterised queries instead.
- f"""SELECT column_name from INFORMATION_SCHEMA.COLUMNS
- WHERE table_name = '{self.name}' AND table_schema='{self.schema}'"""
+ """SELECT column_name from INFORMATION_SCHEMA.COLUMNS
+ WHERE table_name = %s AND table_schema = %s""",
+ (self.name, self.schema)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def check_columns(self):
    # Get actual columns of this table from the database
    db_columns = list(
        zip(
            *get_db().fetch(
                """SELECT column_name from INFORMATION_SCHEMA.COLUMNS
                WHERE table_name = %s AND table_schema = %s""",
                (self.name, self.schema)
            )
        )
    )[0]
    logger.debug(
        "Checking provided columns against db columns.",
        provided=self.columns,
        db_columns=db_columns,
    )
    if not set(self.columns).issubset(db_columns):
        raise ValueError(
            f"{set(self.columns).difference(db_columns)} are not columns of {self.fqn}"
        )
 @pre_flight
 def ff_state_machine(self):
     # Table is immediately in a 'finished executing' state
     q_state_machine = QueryStateMachine(
         get_redis(), self.query_id, get_db().conn_id
     )
     if not q_state_machine.is_completed:
-        q_state_machine.enqueue()
-        q_state_machine.execute()
-        with get_db().engine.begin() as trans:
-            write_cache_metadata(trans, self, compute_time=0)
-        q_state_machine.finish()
+        state, succeeded = q_state_machine.enqueue()
+        state, succeeded = q_state_machine.execute()
+        state, succeeded = q_state_machine.finish()
+        if succeeded:
+            with get_db().engine.begin() as trans:
+                write_cache_metadata(trans, self, compute_time=0)
+            state, succeeded = q_state_machine.finish()
+        if state != QueryState.COMPLETED:
+            raise RuntimeError(
+                f"Couldn't fast forward state machine for table {self}. State is: {state}"
+            )
🛠️ Refactor suggestion
Optimise state machine transitions
The state machine implementation has several potential improvements:

- The multiple state transitions could be simplified
- The second finish() call appears redundant
- Error handling could be more granular

Consider implementing proper error handling for each state transition and simplifying the flow.
@pre_flight
def ff_state_machine(self):
- # Table is immediately in a 'finished executing' state
q_state_machine = QueryStateMachine(
get_redis(), self.query_id, get_db().conn_id
)
if not q_state_machine.is_completed:
- state, succeeded = q_state_machine.enqueue()
- state, succeeded = q_state_machine.execute()
- state, succeeded = q_state_machine.finish()
- if succeeded:
+ try:
+ for transition in [q_state_machine.enqueue, q_state_machine.execute]:
+ state, succeeded = transition()
+ if not succeeded:
+ raise RuntimeError(f"Failed to transition to {state}")
+
with get_db().engine.begin() as trans:
write_cache_metadata(trans, self, compute_time=0)
- state, succeeded = q_state_machine.finish()
- if state != QueryState.COMPLETED:
- raise RuntimeError(
- f"Couldn't fast forward state machine for table {self}. State is: {state}"
- )
+ state, succeeded = q_state_machine.finish()
+ if not succeeded:
+ raise RuntimeError(f"Failed to complete state machine")
+ except Exception as e:
+ logger.error("State machine error", error=str(e))
+ raise
except PreFlightFailedException as exc:
    orig_error_msg = exc.args[0]
    error_msg = (
        f"Internal flowmachine server error: could not create query object using query schema. "
        f"The original error was: '{orig_error_msg}'"
    )
    raise QueryLoadError(
        error_msg, params, orig_error_msg=orig_error_msg, errors=exc.errors
    )
Enhance error handling with exception chaining
The error handling should preserve the original exception context for better debugging.
Apply this diff to implement exception chaining:
  except PreFlightFailedException as exc:
      orig_error_msg = exc.args[0]
      error_msg = (
          f"Internal flowmachine server error: could not create query object using query schema. "
          f"The original error was: '{orig_error_msg}'"
      )
      raise QueryLoadError(
          error_msg, params, orig_error_msg=orig_error_msg, errors=exc.errors
-     )
+     ) from exc
🧰 Tools
🪛 Ruff

95-97: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
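To make the B904 point concrete, here is a minimal standalone sketch of what raise ... from changes (plain Python, no FlowKit imports; the exception names are illustrative only):

```python
class QueryLoadError(Exception):
    pass

def load():
    try:
        raise ValueError("original failure")
    except ValueError as exc:
        # "from exc" sets __cause__, so the traceback reads
        # "The above exception was the direct cause of ..."
        # rather than "During handling of the above exception ...".
        raise QueryLoadError("could not load query") from exc

try:
    load()
except QueryLoadError as e:
    assert isinstance(e.__cause__, ValueError)
```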
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (5)
flowmachine/tests/test_table.py (3)
36-45: Consider additional preflight test scenarios

Whilst the current test cases cover basic failure scenarios, consider adding these additional cases:

- Table with mixed valid and invalid columns
- Table with case-sensitive column names
- Table with duplicate column names

  [
      dict(name="events.calls", columns=["NO SUCH COLUMN"]),
      dict(name="NO SUCH TABLE", columns=["id"]),
+     dict(name="events.calls", columns=["id", "NO SUCH COLUMN"]),
+     dict(name="events.calls", columns=["ID"]),  # case-sensitive check
+     dict(name="events.calls", columns=["id", "id"]),  # duplicates
  ],
48-51: Improve test clarity and documentation

A few improvements could enhance this test:

- The docstring mentions "user schema" but the test checks for "flowmachine" schema
- The assertion could be more readable

- """Test that where no schema is provided, user schema is checked."""
+ """Test that where no schema is provided, flowmachine schema is checked."""
  t = Table("gambia_admin2", columns=["geom"])
- assert "flowmachine" == t.schema
+ assert t.schema == "flowmachine"

🧰 Tools
🪛 Ruff

51-51: Yoda condition detected. Rewrite as t.schema == "flowmachine" (SIM300)
Line range hint 1-117: Consider adding tests for database connection behaviour

Given that a key objective of this PR is to eliminate the need for database connections during Query initialization, consider adding explicit tests to verify:

- Table instantiation without database connection
- Lazy database connection behaviour
- Connection state during preflight checks

Would you like assistance in implementing these additional test cases?
🧰 Tools
🪛 Ruff
7-7: flowmachine.core.errors.flowmachine_errors.QueryErroredException imported but unused. Remove unused import (F401)

51-51: Yoda condition detected. Rewrite as t.schema == "flowmachine" (SIM300)
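A minimal sketch of the first scenario above (assumptions: a pytest-style test, unittest.mock, and that the Table module resolves its connection via a get_db imported in flowmachine.core.table — the patch target, test name, and columns assertion are illustrative, not code from this PR):

```python
from unittest.mock import patch

from flowmachine.core import Table


def test_table_init_needs_no_db_connection():
    # If Table.__init__ still touched the database, this patch would
    # surface it as an AssertionError during construction
    with patch(
        "flowmachine.core.table.get_db",
        side_effect=AssertionError("DB accessed during __init__"),
    ):
        t = Table("events.calls", columns=["id"])
    assert t.columns == ["id"]
```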
flowmachine/flowmachine/core/infrastructure_table.py (2)
34-62: Add column type documentation.

Consider adding type information and constraints for columns (e.g., valid ranges for height, azimuth, tilt angles) to help prevent data integrity issues.

Example documentation format:

class CellsTable(InfrastructureTable):
    """Table containing cell site information.

    Columns:
        height (float): Height in metres (must be positive)
        azimuth (float): Antenna azimuth in degrees (0-360)
        electrical_tilt (float): Electrical downtilt in degrees
        mechanical_downtilt (float): Mechanical downtilt in degrees
        ...
    """

64-123: Follow boolean naming convention for flag columns.

Consider prefixing boolean columns with "is_" or "has_" to make their type more obvious. This is particularly relevant for the capability flags.

Example changes:

- "hardware_gprs",
+ "has_hardware_gprs",
- "hardware_wifi",
+ "has_hardware_wifi",
- "mms_receiver",
+ "has_mms_receiver",
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
- flowmachine/flowmachine/core/events_table.py (1 hunks)
- flowmachine/flowmachine/core/flowdb_table.py (1 hunks)
- flowmachine/flowmachine/core/infrastructure_table.py (1 hunks)
- flowmachine/tests/test_table.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- flowmachine/flowmachine/core/events_table.py
- flowmachine/flowmachine/core/flowdb_table.py
🧰 Additional context used
🪛 Ruff
flowmachine/tests/test_table.py
7-7: flowmachine.core.errors.flowmachine_errors.QueryErroredException imported but unused. Remove unused import (F401)

51-51: Yoda condition detected. Rewrite as t.schema == "flowmachine" (SIM300)
🔇 Additional comments (4)
flowmachine/tests/test_table.py (4)
6-9: Previous comment about unused import remains valid

The existing review comment about removing the unused QueryErroredException import is still applicable.

🧰 Tools
🪛 Ruff

7-7: flowmachine.core.errors.flowmachine_errors.QueryErroredException imported but unused. Remove unused import (F401)

12-17: Previous comment about expanding test cases remains valid

The existing review comment suggesting additional test cases for column name verification is still applicable.

19-34: Well-structured parameterised test implementation

The test cases effectively cover various invalid table creation scenarios, with proper error handling validation.

Line range hint 53-117: Well-structured test suite with comprehensive coverage

The test implementations consistently handle the columns parameter and provide good coverage of Table functionality.
class InfrastructureTable(FlowDBTable):
    def __init__(self, *, name: str, columns: Optional[list[str]] = None) -> None:
        super().__init__(schema="infrastructure", name=name, columns=columns)
🛠️ Refactor suggestion
Consider parameterising the schema name.
The schema name "infrastructure" is hardcoded in the constructor. Consider making it configurable through class initialisation or environment variables for better flexibility in different environments.
  class InfrastructureTable(FlowDBTable):
-     def __init__(self, *, name: str, columns: Optional[list[str]] = None) -> None:
-         super().__init__(schema="infrastructure", name=name, columns=columns)
+     SCHEMA_NAME = "infrastructure"  # Can be overridden or loaded from config
+
+     def __init__(self, *, name: str, columns: Optional[list[str]] = None) -> None:
+         super().__init__(schema=self.SCHEMA_NAME, name=name, columns=columns)
super().__init__(name="tacs", columns=columns) | ||
|
||
|
||
infrastructure_table_map = dict(tacs=TacsTable, cells=CellsTable, sites=SitesTable) |
🛠️ Refactor suggestion
Enhance type safety of table mapping.
Consider using an Enum for table names and adding type hints to prevent errors from typos or invalid table names.
+from enum import Enum
+from typing import Dict, Type
+
+class InfrastructureTableType(Enum):
+    TACS = "tacs"
+    CELLS = "cells"
+    SITES = "sites"
+
-infrastructure_table_map = dict(tacs=TacsTable, cells=CellsTable, sites=SitesTable)
+infrastructure_table_map: Dict[str, Type[InfrastructureTable]] = {
+    InfrastructureTableType.TACS.value: TacsTable,
+    InfrastructureTableType.CELLS.value: CellsTable,
+    InfrastructureTableType.SITES.value: SitesTable,
+}
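Call sites would then look roughly like this (the chosen columns are just for illustration):

```python
table_cls = infrastructure_table_map[InfrastructureTableType.CELLS.value]
cells = table_cls(columns=["cell_id", "geom_point"])
```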
Committable suggestion was skipped due to low confidence.
class SitesTable(InfrastructureTable):
    all_columns = [
        "site_id",
        "id",
        "version",
        "name",
        "type",
        "status",
        "structure_type",
        "is_cow",
        "date_of_first_service",
        "date_of_last_service",
        "geom_point",
        "geom_polygon",
    ]

    def __init__(self, *, columns: Optional[list[str]] = None) -> None:
        super().__init__(name="sites", columns=columns)
🛠️ Refactor suggestion
Add column name validation in constructor.
The constructor accepts optional columns but doesn't validate if they exist in all_columns. Consider adding validation to prevent runtime errors.
  def __init__(self, *, columns: Optional[list[str]] = None) -> None:
+     if columns is not None and not set(columns).issubset(self.all_columns):
+         invalid_columns = set(columns) - set(self.all_columns)
+         raise ValueError(f"Invalid columns specified: {invalid_columns}")
      super().__init__(name="sites", columns=columns)
LlamaPReview Suggested Improvements:
```diff
@@ -0,0 +1,121 @@
+from typing import Optional
+
+class InfrastructureTable(FlowDBTable):
+    def __init__(self, *, name: str, columns: Optional[list[str]] = None) -> None:
+        super().__init__(schema="infrastructure", name=name, columns=columns)
+
+
+class SitesTable(InfrastructureTable):
+    all_columns = [
+        "site_id",
+        "id",
+        "version",
+        "name",
+        "type",
+        "status",
+        "structure_type",
+        "is_cow",
+        "date_of_first_service",
+        "date_of_last_service",
+        "geom_point",
+        "geom_polygon",
+    ]
+
+    def __init__(self, *, columns=None):
+        super().__init__(name="sites", columns=columns)
+
+
+class CellsTable(InfrastructureTable):
+    all_columns = [
+        "cell_id",
+        "id",
+        "version",
+        "site_id",
+        "name",
+        "type",
+        "msc",
+        "bsc_rnc",
+        "antenna_type",
+        "status",
+        "lac",
+        "height",
+        "azimuth",
+        "transmitter",
+        "max_range",
+        "min_range",
+        "electrical_tilt",
+        "mechanical_downtilt",
+        "date_of_first_service",
+        "date_of_last_service",
+        "geom_point",
+        "geom_polygon",
+    ]
+
+    def __init__(self, *, columns=None):
+        super().__init__(name="cells", columns=columns)
+
+
+class TacsTable(InfrastructureTable):
+    all_columns = [
+        "id",
+        "brand",
+        "model",
+        "width",
+        "height",
+        "depth",
+        "weight",
+        "display_type",
+        "display_colors",
+        "display_width",
+        "display_height",
+        "mms_receiver",
+        "mms_built_in_camera",
+        "wap_push_ota_support",
+        "hardware_gprs",
+        "hardware_edge",
+        "hardware_umts",
+        "hardware_wifi",
+        "hardware_bluetooth",
+        "hardware_gps",
+        "software_os_vendor",
+        "software_os_name",
+        "software_os_version",
+        "wap_push_ota_settings",
+        "wap_push_ota_bookmarks",
+        "wap_push_ota_app_internet",
+        "wap_push_ota_app_browser",
+        "wap_push_ota_app_mms",
+        "wap_push_ota_single_shot",
+        "wap_push_ota_multi_shot",
+        "wap_push_oma_settings",
+        "wap_push_oma_app_internet",
+        "wap_push_oma_app_browser",
+        "wap_push_oma_app_mms",
+        "wap_push_oma_cp_bookmarks",
+        "wap_1_2_1",
+        "wap_2_0",
+        "syncml_dm_settings",
+        "syncml_dm_acc_gprs",
+        "syncml_dm_app_internet",
+        "syncml_dm_app_browser",
+        "syncml_dm_app_mms",
+        "syncml_dm_app_bookmark",
+        "syncml_dm_app_java",
+        "wap_push_oma_app_ims",
+        "wap_push_oma_app_poc",
+        "j2me_midp_10",
+        "j2me_midp_20",
+        "j2me_midp_21",
+        "j2me_cldc_10",
+        "j2me_cldc_11",
+        "j2me_cldc_20",
+ "j2me_cldc_21",
+ "j2me_cldc_22",
+ "j2me_ata_10",
+ "j2me_ata_20",
+ "j2me_ata_21",
+ "j2me_ata_22",
+ "j2me_ata_30",
+ "j2me_ata_31",
+ "j2me_ata_32",
+ "j2me_ata_40",
+ "j2me_ata_41",
+ "j2me_ata_42",
+ "j2me_ata_50",
+ "j2me_ata_51",
+ "j2me_ata_52",
+ "j2me_ata_60",
+ "j2me_ata_61",
+ "j2me_ata_62",
+ "j2me_ata_70",
+ "j2me_ata_71",
+ "j2me_ata_72",
+ "j2me_ata_80",
+ "j2me_ata_81",
+ "j2me_ata_82",
+ "j2me_ata_90",
+ "j2me_ata_91",
+ "j2me_ata_92",
+ "j2me_ata_100",
+ "j2me_ata_101",
+ "j2me_ata_102",
+ "j2me_ata_110",
+ "j2me_ata_111",
+ "j2me_ata_112",
+ "j2me_ata_120",
+ "j2me_ata_121",
+ "j2me_ata_122",
+ "j2me_ata_130",
+ "j2me_ata_131",
+ "j2me_ata_132",
+ "j2me_ata_140",
+ "j2me_ata_141",
+ "j2me_ata_142",
+ "j2me_ata_150",
+ "j2me_ata_151",
+ "j2me_ata_152",
+ "j2me_ata_160",
+ "j2me_ata_161",
+ "j2me_ata_162",
+ "j2me_ata_170",
+ "j2me_ata_171",
+ "j2me_ata_172",
+ "j2me_ata_180",
+ "j2me_ata_181",
+ "j2me_ata_182",
+ "j2me_ata_190",
+ "j2me_ata_191",
+ "j2me_ata_192",
+ "j2me_ata_200",
+ "j2me_ata_201",
+ "j2me_ata_202",
+ "j2me_ata_210",
+ "j2me_ata_211",
+ "j2me_ata_212",
+ "j2me_ata_220",
+ "j2me_ata_221",
+ "j2me_ata_222",
+ "j2me_ata_230",
+ "j2me_ata_231",
+ "j2me_ata_232",
+ "j2me_ata_240",
+ "j2me_ata_241",
+ "j2me_ata_242",
+ "j2me_ata_250",
+ "j2me_ata_251",
+ "j2me_ata_252",
+ "j2me_ata_260",
+ "j2me_ata_261",
+ "j2me_ata_262",
+ "j2me_ata_270",
+ "j2me_ata_271",
+ "j2me_ata_272",
+ "j2me_ata_280",
+ "j2me_ata_281",
+ "j2me_ata_282",
+ "j2me_ata_290",
+ "j2me_ata_291",
+ "j2me_ata_292",
+ "j2me_ata_300",
+ "j2me_ata_301",
+ "j2me_ata_302",
+ "j2me_ata_310",
+ "j2me_ata_311",
+ "j2me_ata_312",
+ "j2me_ata_320",
+ "j2me_ata_321",
+ "j2me_ata_322",
+ "j2me_ata_330",
+ "j2me_ata_331",
+ "j2me_ata_332",
+ "j2me_ata_340",
+ "j2me_ata_341",
+ "j2me_ata_342",
+ "j2me_ata_350",
+ "j2me_ata_351",
+ "j2me_ata_352",
+ "j2me_ata_360",
+ "j2me_ata_361",
+ "j2me_ata_362",
+ "j2me_ata_370",
+ "j2me_ata_371",
+ "j2me_ata_372",
+ "j2me_ata_380",
+ "j2me_ata_381",
+ "j2me_ata_382",
+ "j2me_ata_390",
+ "j2me_ata_391",
+ "j2me_ata_392",
+ "j2me_ata_400",
+ "j2me_ata_401",
+ "j2me_ata_402",
+ "j2me_ata_410",
+ "j2me_ata_411",
+ "j2me_ata_412",
+ "j2me_ata_420",
+ "j2me_ata_421",
+ "j2me_ata_422",
+ "j2me_ata_430",
+ "j2me_ata_431",
+ "j2me_ata_432",
+ "j2me_ata_440",
+ "j2me_ata_441",
+ "j2me_ata_442",
+ "j2me_ata_450",
+ "j2me_ata_451",
+ "j2me_ata_452",
+ "j2me_ata_460",
+ "j2me_ata_461",
+ "j2me_ata_462",
+ "j2me_ata_470",
+ "j2me_ata_471",
+ "j2me_ata_472",
+ "j2me_ata_480",
+ "j2me_ata_481",
+ "j2me_ata_482",
+ "j2me_ata_490",
+ "j2me_ata_491",
+ "j2me_ata_492",
+ "j2me_ata_500",
+ "j2me_ata_501",
+ "j2me_ata_502",
+ "j2me_ata_510",
+ "j2me_ata_511",
+ "j2me_ata_512",
+ "j2me_ata_520",
+ "j2me_ata_521",
+ "j2me_ata_522",
+ "j2me_ata_530",
+ "j2me_ata_531",
+ "j2me_ata_532",
+ "j2me_ata_540",
+ "j2me_ata_541",
+ "j2me_ata_542",
+ "j2me_ata_550",
+ "j2me_ata_551",
+ "j2me_ata_552",
+ "j2me_ata_560",
+ "j2me_ata_561",
+ "j2me_ata_562",
+ "j2me_ata_570",
+ "j2me_ata_571",
+ "j2me_ata_572",
+ "j2me_ata_580",
+ "j2me_ata_581",
+ "j2me_ata_582",
+ "j2me_ata_590",
+ "j2me_ata_591",
+ "j2me_ata_592",
+ "j2me_ata_600",
+ "j2me_ata_601",
+ "j2me_ata_602",
+ "j2me_ata_610",
+ "j2me_ata_611",
+ "j2me_ata_612",
+ "j2me_ata_620",
+ "j2me_ata_621",
+ "j2me_ata_622",
+ "j2me_ata_630",
+ "j2me_ata_631",
+ "j2me_ata_632",
+ "j2me_ata_640",
+ "j2me_ata_641",
+ "j2me_ata_642",
+ "j2me_ata_650",
+ "j2me_ata_651",
+ "j2me_ata_652",
+ "j2me_ata_660",
+ "j2me_ata_661",
+ "j2me_ata_662",
+ "j2me_ata_670",
+ "j2me_ata_671",
+ "j2me_ata_672",
+ "j2me_ata_680",
+ "j2me_ata_681",
+ "j2me_ata_682",
+ "j2me_ata_690",
+ "j2me_ata_691",
+ "j2me_ata_692",
+ "j2me_ata_700",
+ "j2me_ata_701",
+ "j2me_ata_702",
+ "j2me_ata_710",
+ "j2me_ata_711",
+ "j2me_ata_712",
+ "j2me_ata_720",
+ "j2me_ata_721",
+ "j2me_ata_722",
+ "j2me_ata_730",
+ "j2me_ata_731",
+ "j2me_ata_732",
+ "j2me_ata_740",
+ "j2me_ata_741",
+ "j2me_ata_742",
+ "j2me_ata_750",
+ "j2me_ata_751",
+ "j2me_ata_752",
+ "j2me_ata_760",
+ "j2me_ata_761",
+ "j2me_ata_762",
+ "j2me_ata_770",
+ "j2me_ata_771",
+ "j2me_ata_772",
+ "j2me_ata_780",
+ "j2me_ata_781",
+ "j2me_ata_782",
+ "j2me_ata_790",
+ "j2me_ata_791",
+ "j2me_ata_792",
+ "j2me_ata_800",
+ "j2me_ata_801",
+ "j2me_ata_802",
+ "j2me_ata_810",
+ "j2me_ata_811",
+ "j2me_ata_812",
+ "j2me_ata_820",
+ "j2me_ata_821",
+ "j2me_ata_822",
+ "j2me_ata_830",
+ "j2me_ata_831",
+ "j2me_ata_832",
+ "j2me_ata_840",
+ "j2me_ata_841",
+ "j2me_ata_842",
+ "j2me_ata_850",
+ "j2me_ata_851",
+ "j2me_ata_852",
+ "j2me_ata_860",
+ "j2me_ata_861",
+ "j2me_ata_862",
+ "j2me_ata_870",
+ "j2me_ata_871",
+ "j2me_ata_872",
+ "j2me_ata_880",
+ "j2me_ata_881",
+ "j2me_ata_882",
+ "j2me_ata_890",
+ "j2me_ata_891",
+ "j2me_ata_892",
+ "j2me_ata_900",
+ "j2me_ata_901",
+ "j2me_ata_902",
+ "j2me_ata_910",
+ "j2me_ata_911",
+ "j2me_ata_912",
+ "j2me_ata_920",
+ "j2me_ata_921",
+ "j2me_ata_922",
+ "j2me_ata_930",
+ "j2me_ata_931",
+ "j2me_ata_932",
+ "j2me_ata_940",
+ "j2me_ata_941",
+ "j2me_ata_942",
+ "j2me_ata_950",
+ "j2me_ata_951",
+ "j2me_ata_952",
+ "j2me_ata_960",
+ "j2me_ata_961",
+ "j2me_ata_962",
+ "j2me_ata_970",
+ "j2me_ata_971",
+ "j2me_ata_972",
+ "j2me_ata_980",
+ "j2me_ata_981",
+ "j2me_ata_982",
+ "j2me_ata_990",
+ "j2me_ata_991",
+ "j2me_ata_992",
+ "j2me_ata_1000",
+ "j2me_ata_1001",
+ "j2me_ata_1002",
+ "j2me_ata_1010",
+ "j2me_ata_1011",
+ "j2me_ata_1012",
+ "j2me_ata_1020",
+ "j2me_ata_1021",
+ "j2me_ata_1022",
+ "j2me_ata_1030",
+ "j2me_ata_1031",
+ "j2me_ata_1032",
+ "j2me_ata_1040",
+ "j2me_ata_1041",
+ "j2me_ata_1042",
+ "j2me_ata_1050",
+ "j2me_ata_1051",
+ "j2me_ata_1052",
+ "j2me_ata_1060",
+ "j2me_ata_1061",
+ "j2me_ata_1062",
+ "j2me_ata_1070",
+ "j2me_ata_1071",
+ "j2me_ata_1072",
+ "j2me_ata_1080",
+ "j2me_ata_1081",
+ "j2me_ata_1082",
+ "j2me_ata_1090",
+ "j2me_ata_1091",
+ "j2me_ata_1092",
+ "j2me_ata_1100",
+ "j2me_ata_1101",
+ "j2me_ata_1102",
+ "j2me_ata_1110",
+ "j2me_ata_1111",
+ "j2me_ata_1112",
+ "j2me_ata_1120",
+ "j2me_ata_1121",
+ "j2me_ata_1122",
+ "j2me_ata_1130",
+ "j2me_ata_1131",
+ "j2me_ata_1132",
+ "j2me_ata_1140",
+ "j2me_ata_1141",
+ "j2me_ata_1142",
+ "j2me_ata_1150",
+ "j2me_ata_1151",
+ "j2me_ata_1152",
+ "j2me_ata_1160",
+ "j2me_ata_1161",
+ "j2me_ata_1162",
+ "j2me_ata_1170",
+ "j2me_ata_1171",
+ "j2me_ata_1172",
+ "j2me_ata_1180",
+ "j2me_ata_1181",
+ "j2me_ata_1182",
+ "j2me_ata_1190",
+ "j2me_ata_1191",
+ "j2me_ata_1192",
+ "j2me_ata_1200",
+ "j2me_ata_1201",
+ "j2me_ata_1202",
+ "j2me_ata_1210",
+ "j2me_ata_1211",
+ "j2me_ata_1212",
+ "j2me_ata_1220",
+ "j2me_ata_1221",
+ "j2me_ata_1222",
+ "j2me_ata_1230",
+ "j2me_ata_1231",
+ "j2me_ata_1232",
+ "j2me_ata_1240",
+ "j2me_ata_1241",
+ "j2me_ata_1242",
+ "j2me_ata_1250",
+ "j2me_ata_1251",
+ "j2me_ata_1252",
+ "j2me_ata_1260",
+ "j2me_ata_1261",
+ "j2me_ata_1262",
+ "j2me_ata_1270",
+ "j2me_ata_1271",
+ "j2me_ata_1272",
+ "j2me_ata_1280",
+ "j2me_ata_1281",
+ "j2me_ata_1282",
+ "j2me_ata_1290",
+ "j2me_ata_1291",
+ "j2me_ata_1292",
+ "j2me_ata_1300",
+ "j2me_ata_1301",
+ "j2me_ata_1302",
+ "j2me_ata_1310",
+ "j2me_ata_1311",
+ "j2me_ata_1312",
+ "j2me_ata_1320",
+ "j2me_ata_1321",
+ "j2me_ata_1322",
+ "j2me_ata_1330",
+ "j2me_ata_1331",
+ "j2me_ata_1332",
+ "j2me_ata_1340",
+ "j2me_ata_1341",
+ "j2me_ata_1342",
+ "j2me_ata_1350",
+ "j2me_ata_1351",
+ "j2me_ata_1352",
+ "j2me_ata_1360",
+ "j2me_ata_1361",
+ "j2me_ata_1362",
+ "j2me_ata_1370",
+ "j2me_ata_1371",
+ "j2me_ata_1372",
+ "j2me_ata_1380",
+ "j2me_ata_1381",
+ "j2me_ata_1382",
+ "j2me_ata_1390",
+ "j2me_ata_1391",
+ "j2me_ata_1392",
+ "j2me_ata_1400",
+ "j2me_ata_1401",
+ "j2me_ata_1402",
+ "j2me_ata_1410",
+ "j2me_ata_1411",
+ "j2me_ata_1412",
+ "j2me_ata_1420",
+ "j2me_ata_1421",
+ "j2me_ata_14
Closes #390 (which was already closed, but this closes it even more)

I have:

Description

Removes the use of a database connection during query init:

- Adds a @pre_flight decorator which Query subclasses may use to indicate a method should be run to confirm the query is runnable
- Adds a preflight() method to query, which calls all applicable pre-flight check methods for the query
- Since checks are no longer run during __init__, any checks that require database access must now be implemented as a method of the class and use the @pre_flight decorator
- The events. prefix is no longer required.
- Table no longer automatically infers columns from the database, they must be specified.

For now, pre_flight is triggered during an actual cache write of a query, and also right after creating the query object to be stored. This preserves the existing behaviour of queries that can't actually run erroring at submission in the API.
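As a minimal sketch of the new pattern (everything here except the pre_flight decorator and the preflight() hook is illustrative: the subclass, the import paths for Query and get_db, and the existence check are assumptions, not code from this PR):

```python
from flowmachine.core.preflight import pre_flight
from flowmachine.core.query import Query  # assumed import path
from flowmachine.core.context import get_db  # assumed import path


class CountRows(Query):  # hypothetical subclass
    def __init__(self, fqn: str):
        self.fqn = fqn
        super().__init__()  # no database access happens here any more

    @pre_flight
    def check_table_exists(self):
        # Runs when .preflight() is called (e.g. just before a cache
        # write), rather than at construction time
        rows = get_db().fetch("SELECT to_regclass(%s)", (self.fqn,))
        if rows[0][0] is None:
            raise ValueError(f"{self.fqn} does not exist")

    @property
    def column_names(self) -> list:
        return ["count"]

    def _make_query(self) -> str:
        return f"SELECT count(*) FROM {self.fqn}"
```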
I think in future we'd want to move away from that, because db checks are relatively expensive and slow down query creation when creating lots of queries via the API. It's possible, however, that switching out the existing sqlalchemy connection to Postgres for an asyncio one would alleviate that quite a bit. I guess we could also introduce two distinct connection pools as well - one for babysitting executing queries, and a separate one for pre-flight checks.
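The two-pool idea could look roughly like this with plain SQLAlchemy (a sketch under assumptions: the DB_URI value and pool sizes are invented, and FlowKit's actual engine setup lives elsewhere):

```python
from sqlalchemy import create_engine

DB_URI = "postgresql://flowdb/flowdb"  # placeholder

# Larger pool for long-running query execution, where connections are
# held for the lifetime of a computation
executor_engine = create_engine(DB_URI, pool_size=10, max_overflow=0)

# Small, separate pool for quick pre-flight metadata checks, so a burst
# of query submissions can't starve the executor of connections
preflight_engine = create_engine(DB_URI, pool_size=2, max_overflow=5)
```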
Summary:

Refactors query initialization in Flowmachine to improve performance by removing automatic database checks during object creation, using a new @pre_flight decorator and preflight() method.

Key points:

- Adds a @pre_flight decorator for methods requiring DB checks.
- Adds a preflight() method in the Query class to trigger these checks.
- Removes database checks from Query initialization.

Generated with ❤️ by ellipsis.dev
Summary by CodeRabbit

Release Notes

New Features

- Introduced a @pre_flight decorator for Query subclasses to validate query runnability.
- Added a preflight() method in the Query class for executing pre-flight checks.
- Added a zero_cache table.

Changed Functionality

- Changes to the TotalActivePeriodsSubscriber class.
- Updated Table instances across multiple test cases to include explicit column definitions.

Bug Fixes

Documentation

- Updated CHANGELOG.md to reflect notable changes in the FlowKit project.

These updates enhance functionality, improve error handling, and ensure better documentation for users.