Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code quality guidelines #10

Open
chrisconlan opened this issue Jan 25, 2021 · 7 comments
Open

Code quality guidelines #10

chrisconlan opened this issue Jan 25, 2021 · 7 comments
Assignees

Comments

@chrisconlan
Copy link
Member

chrisconlan commented Jan 25, 2021

Here are some things I noticed while reviewing code that should be made rules.

  • All function names should be verbs.
  • A leading underscore e.g. _some_function can denote a private method, meaning it won't be imported from from qttk import * imports and users know that it is not intended for them.
  • Use an indent of 4 spaces
  • The @time_this decorator should not wrap functions that will be imported by users. Write your code accordingly.
  • Class definitions are title case e.g. MyClassName, while function names are snake case e.g. my_function_name. Variable names are also snake case e.g. my_variable_name.
  • Input arguments to functions should not be modified within the function call. This is especially important when dealing with data frames.
  • Don't assume you know the user's working directory. Always establish a path to external files relative to the script or module that is importing them, e.g. os.path.join(os.path.dirname(__file__), '..', 'my_data.csv')
  • Python filenames should all be snakecase and should not contain numbers.
  • We're programming clear and clean code, where the code itself serves as documentation. Think about the end-user's experience of exploring the source code when organizing the repo.
  • Use assert statements if __name__ == '__main__': clauses to test code within the same file the code is written.
  • Regarding the above, no files should have the word test in them, because that implies adherence to a different testing framework. It is also inappropriate for users to import files with the word test in them.
  • Keep it DRY (don't repeat yourself). There should be a single source of truth for each distinct functionality in the repo. If there are multiple versions of a function, which we expect due to our profiling work, it should clear which one is optimal and which one is recommended for production use.
  • Date-indexed pandas objects are the backbone data structure of our project.
    • If a function takes a pandas object in, it should generally return a pandas object with the same index. Avoid unnecessarily manipulating the index of the input.
    • Don't return unnecessarily complex objects. Return a pd.Series if possible. Return a pd.DataFrame if necessary.
    • Name the series when helpful. Name the columns of a data frame when helpful.
      • e.g. my_series.name = 'rsi'
      • e.g. my_df.columns = ['a', 'b', 'c']
@joe-wojniak joe-wojniak changed the title Code quality guildelines Code quality guidelines Jan 26, 2021
@alexpryszlakh alexpryszlakh reopened this Jan 26, 2021
@alexpryszlakh alexpryszlakh reopened this Jan 26, 2021
@emican
Copy link
Contributor

emican commented Jan 29, 2021

What do you think of another convention: separating the presentation layer (graphs) from modules with calculations? price_crossover.py is an example of this convention and imports rsi and bollinger indicators, _plot is an internal function and the main entry point is where the a graph is created. I struggled to find a way to add tests to bollinger_2.py because graphs are created in main.

@chrisconlan
Copy link
Member Author

That makes sense. I would also like to generalize the charts around the “overlays” vs “indicators” concept.

@emican
Copy link
Contributor

emican commented Jan 30, 2021

Please allow me to propose this convention for consideration: single purpose modules.

Need:
Allow people to focus on the algorithm iterations when reading modules.

Examples:
Simple moving average was mixed in with cumulative moving averages in cumulative_moving_average.py (formally testma.py)

cumulative_moving_average.py (formally testma.py) has logic to load validation data, we may want to add a new module to utils/data_validation.py and relocate the logic

@joe-wojniak
Copy link
Contributor

joe-wojniak commented Jan 30, 2021 via email

@joe-wojniak
Copy link
Contributor

joe-wojniak commented Jan 30, 2021 via email

@emican
Copy link
Contributor

emican commented Jan 30, 2021

No worries, I can put the changes in one commit so we can easily revert. Next week at the office will be busy so I'm trying to do as much as possible now.

@chrisconlan
Copy link
Member Author

Good idea Eric

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants