Remove database connection during query init #3128
Changes from 72 commits
@@ -0,0 +1,120 @@
+from flowmachine.core.flowdb_table import FlowDBTable
+
+
+class EventsTable(FlowDBTable):
+    def __init__(self, *, name, columns):
+        super().__init__(schema="events", name=name, columns=columns)
+
+
+class CallsTable(EventsTable):
+    all_columns = [
+        "id",
+        "outgoing",
+        "datetime",
+        "duration",
+        "network",
+        "msisdn",
+        "msisdn_counterpart",
+        "location_id",
+        "imsi",
+        "imei",
+        "tac",
+        "operator_code",
+        "country_code",
+    ]
Comment on lines +15 to +29
🛠️ Refactor suggestion
Refactor duplicate column definitions.
The calls, forwards and sms tables repeat almost the same column list; a shared base class could hold the common columns:
# Create a base class for common columns
class CommonEventsTable(EventsTable):
    all_columns = [
        "id",
        "outgoing",
        "datetime",
        "network",
        "msisdn",
        "msisdn_counterpart",
        "location_id",
        "imsi",
        "imei",
        "tac",
        "operator_code",
        "country_code",
    ]

class CallsTable(CommonEventsTable):
    def __init__(self, *, columns=None):
        super().__init__(name="calls", columns=columns)

# Similar changes for ForwardsTable and SmsTable
Also applies to: 31-44, 51-64
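One caveat with the shared list suggested above: in the diff, CallsTable also carries a "duration" column, which the common list omits. The following is a minimal sketch of how a subclass could extend the shared list rather than reuse it unchanged. It assumes a stub `EventsTable` that only records its arguments (the real class forwards to `FlowDBTable`), so it illustrates the inheritance pattern only.

```python
class EventsTable:
    """Stub standing in for the PR's EventsTable; it only records its arguments."""

    all_columns = []

    def __init__(self, *, name, columns=None):
        self.name = name
        # Fall back to every known column when none are requested.
        self.columns = list(self.all_columns if columns is None else columns)


class CommonEventsTable(EventsTable):
    all_columns = [
        "id", "outgoing", "datetime", "network", "msisdn", "msisdn_counterpart",
        "location_id", "imsi", "imei", "tac", "operator_code", "country_code",
    ]


class CallsTable(CommonEventsTable):
    # Calls also record a duration, so extend the shared list instead of reusing it.
    all_columns = [*CommonEventsTable.all_columns, "duration"]

    def __init__(self, *, columns=None):
        super().__init__(name="calls", columns=columns)


class SmsTable(CommonEventsTable):
    def __init__(self, *, columns=None):
        super().__init__(name="sms", columns=columns)
```

Appending places "duration" last, whereas the diff lists it after "datetime"; whether the default column order matters to callers is not stated in the PR.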
+
+    def __init__(self, *, columns=None):
+        super().__init__(name="calls", columns=columns)
Standardise the constructor parameter requirements.
The table constructors are inconsistent: CallsTable and ForwardsTable default columns to None, while SmsTable, MdsTable and TopupsTable require it.
This inconsistency could lead to confusion and maintenance issues. Consider standardising the constructor signatures. If columns should be optional everywhere:
-    def __init__(self, *, columns):
+    def __init__(self, *, columns=None):
Also applies to: 50-51, 70-71, 91-92, 114-115
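To make the consequence of the mixed signatures concrete, here is a small sketch of code that looks a table class up by name and instantiates it without arguments. The stub `EventsTable` and the helper `make_table` are assumptions for illustration and are not part of the PR; only the two constructor signatures mirror the diff.

```python
class EventsTable:
    """Stub: the real class forwards to FlowDBTable; this one only stores arguments."""

    def __init__(self, *, name, columns):
        self.name = name
        self.columns = columns


class CallsTable(EventsTable):
    def __init__(self, *, columns=None):  # optional, as in the diff
        super().__init__(name="calls", columns=columns)


class SmsTable(EventsTable):
    def __init__(self, *, columns):  # required, as in the diff
        super().__init__(name="sms", columns=columns)


events_table_map = dict(calls=CallsTable, sms=SmsTable)


def make_table(kind):
    """Hypothetical helper: build a table by name without specifying columns."""
    return events_table_map[kind]()


calls = make_table("calls")  # works: columns defaults to None

try:
    make_table("sms")
    sms_error = None
except TypeError as exc:  # the required keyword-only argument is missing
    sms_error = exc
```

Generic code that goes through the map has to know which classes accept a bare call, which is the maintenance cost the comment points at.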
+
+
+class ForwardsTable(EventsTable):
+    all_columns = [
+        "id",
+        "outgoing",
+        "datetime",
+        "network",
+        "msisdn",
+        "msisdn_counterpart",
+        "location_id",
+        "imsi",
+        "imei",
+        "tac",
+        "operator_code",
+        "country_code",
+    ]
+
+    def __init__(self, *, columns=None):
+        super().__init__(name="forwards", columns=columns)
+
+
+class SmsTable(EventsTable):
+    all_columns = [
+        "id",
+        "outgoing",
+        "datetime",
+        "network",
+        "msisdn",
+        "msisdn_counterpart",
+        "location_id",
+        "imsi",
+        "imei",
+        "tac",
+        "operator_code",
+        "country_code",
+    ]
+
+    def __init__(self, *, columns):
+        super().__init__(name="sms", columns=columns)
+
+
+class MdsTable(EventsTable):
+    all_columns = [
+        "id",
+        "datetime",
+        "duration",
+        "volume_total",
+        "volume_upload",
+        "volume_download",
+        "msisdn",
+        "location_id",
+        "imsi",
+        "imei",
+        "tac",
+        "operator_code",
+        "country_code",
+    ]
+
+    def __init__(self, *, columns):
+        super().__init__(name="mds", columns=columns)
+
+
+class TopupsTable(EventsTable):
+    all_columns = [
+        "id",
+        "datetime",
+        "type",
+        "recharge_amount",
+        "airtime_fee",
+        "tax_and_fee",
+        "pre_event_balance",
+        "post_event_balance",
+        "msisdn",
+        "location_id",
+        "imsi",
+        "imei",
+        "tac",
+        "operator_code",
+        "country_code",
+    ]
+
+    def __init__(self, *, columns):
+        super().__init__(name="topups", columns=columns)
🛠️ Refactor suggestion
Consider using composition to manage column definitions.
The current implementation has significant duplication in column definitions. Consider restructuring using composition to improve maintainability:
class ColumnSets:
    COMMON = [
        "id",
        "datetime",
        "location_id",
        "msisdn",
    ]
    DEVICE = [
        "imsi",
        "imei",
        "tac",
    ]
    NETWORK = [
        "operator_code",
        "country_code",
    ]
    INTERACTION = [
        "msisdn_counterpart",
        "network",
        "outgoing",
    ]

class CallsTable(EventsTable):
    all_columns = [
        *ColumnSets.COMMON,
        *ColumnSets.DEVICE,
        *ColumnSets.NETWORK,
        *ColumnSets.INTERACTION,
        "duration",
    ]
This approach would:
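A quick way to sanity-check a composition like this is to compare the composed list against the list it replaces. The sketch below copies the CallsTable columns from the diff and the `ColumnSets` groups from the suggestion; the names `CALLS_ORIGINAL`, `CALLS_COMPOSED` and the three result variables are made up for the example.

```python
class ColumnSets:
    COMMON = ["id", "datetime", "location_id", "msisdn"]
    DEVICE = ["imsi", "imei", "tac"]
    NETWORK = ["operator_code", "country_code"]
    INTERACTION = ["msisdn_counterpart", "network", "outgoing"]


# Column list of CallsTable exactly as written in the diff.
CALLS_ORIGINAL = [
    "id", "outgoing", "datetime", "duration", "network", "msisdn",
    "msisdn_counterpart", "location_id", "imsi", "imei", "tac",
    "operator_code", "country_code",
]

# Column list produced by the suggested composition.
CALLS_COMPOSED = [
    *ColumnSets.COMMON,
    *ColumnSets.DEVICE,
    *ColumnSets.NETWORK,
    *ColumnSets.INTERACTION,
    "duration",
]

same_members = set(CALLS_COMPOSED) == set(CALLS_ORIGINAL)
same_order = CALLS_COMPOSED == CALLS_ORIGINAL
no_duplicates = len(CALLS_COMPOSED) == len(set(CALLS_COMPOSED))
```

The composed list has the same members and no duplicates, but a different order. That is harmless for the subset check in FlowDBTable, which compares sets, but it would change the default column order whenever `columns` is left as None.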
+
+
+events_table_map = dict(
+    calls=CallsTable,
+    sms=SmsTable,
+    mds=MdsTable,
+    topups=TopupsTable,
+    forwards=ForwardsTable,
+)
@@ -0,0 +1,20 @@
+from abc import ABCMeta
+
+from flowmachine.core.table import Table
+
+
+
+class FlowDBTable(Table, metaclass=ABCMeta):
+    def __init__(self, *, name, schema, columns):
+        if columns is None:
+            columns = self.all_columns
+        if set(columns).issubset(self.all_columns):
+            super().__init__(schema=schema, name=name, columns=columns)
+        else:
+            raise ValueError(
+                f"Columns {columns} must be a subset of {self.all_columns}"
+            )
Several improvements needed in constructor implementation.
The constructor has the following issues:
Consider applying these improvements:
-    def __init__(self, *, name, schema, columns):
+    def __init__(self, *, name: str, schema: str, columns: list[str] | None = None) -> None:
+        """
+        Initialize a FlowDB table with specified name, schema, and columns.
+
+        Args:
+            name: The name of the table
+            schema: The database schema containing the table
+            columns: List of column names. If None, uses all available columns
+
+        Raises:
+            ValueError: If provided columns are not valid or if name/schema are empty
+            AttributeError: If all_columns is not defined in the subclass
+        """
+        if not name or not schema:
+            raise ValueError("Both name and schema must be non-empty strings")
+
+        if not hasattr(self, 'all_columns'):
+            raise AttributeError("Subclass must define all_columns")
+
         if columns is None:
             columns = self.all_columns
         if set(columns).issubset(self.all_columns):
             super().__init__(schema=schema, name=name, columns=columns)
         else:
             raise ValueError(
-                f"Columns {columns} must be a subset of {self.all_columns}"
+                f"Invalid columns: {set(columns) - set(self.all_columns)}. "
+                f"Must be a subset of: {self.all_columns}"
             )
+
+    @property
+    def all_columns(self):
+        raise NotImplementedError
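The interplay between the base-class `all_columns` property and the subset check can be tried out in isolation. This is a sketch only: `Table` is a stub assumed to store its arguments (the real `flowmachine.core.table.Table` does more), the `ABCMeta` metaclass is left out, and the error message follows the wording proposed in the review suggestion rather than the merged code.

```python
class Table:
    """Stub for flowmachine.core.table.Table; assumed here to just store its arguments."""

    def __init__(self, *, name, schema, columns):
        self.name, self.schema, self.columns = name, schema, list(columns)


class FlowDBTable(Table):
    def __init__(self, *, name, schema, columns=None):
        if columns is None:
            columns = self.all_columns
        unknown = set(columns) - set(self.all_columns)
        if unknown:
            # Name only the offending columns, as the review suggestion proposes.
            raise ValueError(
                f"Invalid columns: {sorted(unknown)}. "
                f"Must be a subset of: {self.all_columns}"
            )
        super().__init__(name=name, schema=schema, columns=columns)

    @property
    def all_columns(self):
        raise NotImplementedError


class CallsTable(FlowDBTable):
    # A plain class attribute on the subclass shadows the base-class property.
    all_columns = ["id", "datetime", "duration"]

    def __init__(self, *, columns=None):
        super().__init__(name="calls", schema="events", columns=columns)
```

With this layout, `CallsTable()` selects every column, `CallsTable(columns=["id", "bogus"])` raises `ValueError` naming `bogus`, and instantiating `FlowDBTable` directly raises `NotImplementedError`. Note that `hasattr` only suppresses `AttributeError`, so a `hasattr(self, 'all_columns')` guard would propagate the property's `NotImplementedError` on the base class instead of returning False.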
@@ -6,6 +6,7 @@
 """
 Simple utility class that represents tables with geometry.
 """
+from typing import Optional, List
 
 from . import Table
 from .mixins import GeoDataMixin
@@ -47,19 +48,25 @@ class GeoTable(GeoDataMixin, Table):
     """
 
     def __init__(
-        self, name=None, schema=None, columns=None, geom_column="geom", gid_column=None
+        self,
+        name: str,
+        *,
+        schema: Optional[str] = None,
+        columns: List[str],
+        geom_column: str = "geom",
+        gid_column: Optional[str] = None,
     ):
         self.geom_column = geom_column
         self.gid_column = gid_column
-        super().__init__(name=name, schema=schema, columns=columns)
-        if geom_column not in self.column_names:
+        if self.geom_column not in columns:
             raise ValueError(
-                "geom_column: {} is not a column in this table.".format(geom_column)
+                f"geom_column: {self.geom_column} is not a column in this table."
             )
-        if gid_column is not None and gid_column not in self.column_names:
+        if self.gid_column is not None and self.gid_column not in columns:
             raise ValueError(
-                "gid_column: {} is not a column in this table.".format(gid_column)
+                f"gid_column: {self.gid_column} is not a column in this table."
             )
+        super().__init__(name=name, schema=schema, columns=columns)
Comment on lines +51 to +69
💡 Codebase verification
Based on the comprehensive results from our verification scripts, I can now provide a final response:
Required
The codebase analysis shows that:
The change to make
🔗 Analysis chain
Verify impact of required columns parameter
The change to make
Let me generate additional scripts to get a more comprehensive view of the codebase.
The previous results showed test cases that have been updated with the required
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for GeoTable instantiations that might need updating
rg "GeoTable\(" --type py -B 2 -A 2
# Look for any existing tests to ensure they've been updated
rg "test.*GeoTable" --type py
Length of output: 2844
Script:
#!/bin/bash
# Search for GeoTable instantiations in the entire codebase
ast-grep --pattern 'GeoTable($$$)'
# Search for any imports of GeoTable to find all files that might use it
rg "from.*import.*GeoTable" --type py
# Look for any class that might inherit from GeoTable
ast-grep --pattern 'class $_ ($$$GeoTable$$$)'
Length of output: 7209
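The practical effect of the GeoTable change is that the geometry and gid columns are checked against the caller-supplied `columns` list before the parent constructor runs, and that `columns` must now be passed by keyword. A minimal sketch, assuming a stub `Table` that only records its arguments and leaving out `GeoDataMixin`; the table and schema names in the usage lines are hypothetical.

```python
from typing import List, Optional


class Table:
    """Stub for flowmachine's Table; assumed here to only record its arguments."""

    def __init__(self, *, name, schema=None, columns):
        self.name, self.schema, self.columns = name, schema, list(columns)


class GeoTable(Table):
    def __init__(
        self,
        name: str,
        *,
        schema: Optional[str] = None,
        columns: List[str],
        geom_column: str = "geom",
        gid_column: Optional[str] = None,
    ):
        self.geom_column = geom_column
        self.gid_column = gid_column
        # Validate against the caller-supplied list before calling the parent.
        if self.geom_column not in columns:
            raise ValueError(
                f"geom_column: {self.geom_column} is not a column in this table."
            )
        if self.gid_column is not None and self.gid_column not in columns:
            raise ValueError(
                f"gid_column: {self.gid_column} is not a column in this table."
            )
        super().__init__(name=name, schema=schema, columns=columns)


# Hypothetical table and schema names, for illustration only.
ok = GeoTable("admin3", schema="geography", columns=["geom", "gid"], gid_column="gid")

try:
    GeoTable("admin3", columns=["gid"])  # default geom_column "geom" is absent
    geom_error = None
except ValueError as exc:
    geom_error = exc
```

Callers that previously relied on `columns=None` now get a `TypeError` for the missing keyword argument, which is why the verification scripts above search for every `GeoTable(` call site.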
 
     def _geo_augmented_query(self):
         if self.gid_column is None:
🛠️ Refactor suggestion
Consider optimizing the cache score calculation performance.
The subquery (SELECT array_agg(object_class) as classes FROM cache.zero_cache) will be executed on every cache touch operation. For better performance, consider:
Example optimization using a JOIN: