I wrote a Python script for some simple data analysis; adjust it to your own needs if you find it useful #23

Open
DanielZhao1990 opened this issue Oct 21, 2023 · 0 comments

# Import required libraries
import re
from collections import defaultdict

import matplotlib
matplotlib.use("TkAgg")  # select the backend before pyplot is imported
import matplotlib.pyplot as plt
import pandas as pd

# Read the file with UTF-16 encoding
with open(r'D:\GreenSoft\KMCounter.ini', 'r', encoding='utf-16') as file:  # raw string keeps the backslashes literal
    file_content = file.readlines()

# Initialize a dictionary to store the data
data_dict = defaultdict(lambda: {'lbcount': 0, 'rbcount': 0, 'wheel': 0, 'move': 0, 'keystrokes': 0})

# Regular expression pattern to extract date in the format YYYYMMDD and the relevant data
pattern = re.compile(r'\[(\d{8})\]')
key_patterns = ['lbcount', 'rbcount', 'wheel', 'move', 'keystrokes']

# Extract data
current_date = None
for line in file_content:
    # Skip the [total] section so it is not treated as a date
    if line.strip() == '[total]':
        current_date = None
        continue
    date_match = pattern.match(line.strip())
    if date_match:
        current_date = date_match.group(1)
        current_date = f"{current_date[:4]}-{current_date[4:6]}-{current_date[6:]}"
        continue
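    if current_date and '=' in line:
        # Split once on '=' and match the key exactly, ignoring surrounding whitespace
        key, _, raw_value = line.partition('=')
        key = key.strip()
        if key in key_patterns:
            data_dict[current_date][key] = float(raw_value.strip())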

# Convert the defaultdict to a normal dict
data_dict = dict(data_dict)

# Convert the data dictionary to a DataFrame
df = pd.DataFrame.from_dict(data_dict, orient='index')
df.index = pd.to_datetime(df.index)

# The [total] section was already skipped while parsing, so only dated rows remain;
# sort them chronologically so the plotted lines run left to right
df_filtered = df.sort_index()

# Min-max normalize each column to [0, 1] so metrics on different scales share one axis
df_filtered_normalized = (df_filtered - df_filtered.min()) / (df_filtered.max() - df_filtered.min())

# Plotting
fig, ax = plt.subplots(figsize=(15, 10))
for column in df_filtered_normalized.columns:
    df_filtered_normalized[column].plot(ax=ax, marker='o', label=column)
ax.set_xlabel('Date')
ax.set_ylabel('Normalized Value')
ax.set_title('Comparison of lbcount, rbcount, wheel, move, and keystrokes Over Time')
ax.legend()
plt.tight_layout()
plt.show()
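
If you want to check the parsing step without touching the real file, here is a minimal sketch that wraps the same logic in a function and runs it on an in-memory sample. The function name `parse_kmcounter` and the sample contents are hypothetical (I am assuming the file is laid out as `[YYYYMMDD]` sections followed by `key=value` lines, as the script above implies). One difference from the script: this version ignores every non-date section, not only `[total]`.

```python
import re
from collections import defaultdict

DATE_SECTION = re.compile(r'\[(\d{8})\]')
KEYS = ('lbcount', 'rbcount', 'wheel', 'move', 'keystrokes')


def parse_kmcounter(lines):
    """Return {'YYYY-MM-DD': {key: float}} for every dated section."""
    data = defaultdict(lambda: dict.fromkeys(KEYS, 0.0))
    current_date = None
    for raw in lines:
        line = raw.strip()
        if line.startswith('['):
            # Any section header resets the date; only [YYYYMMDD] sets a new one,
            # so keys under [total] (or other non-date sections) are ignored.
            current_date = None
            match = DATE_SECTION.fullmatch(line)
            if match:
                d = match.group(1)
                current_date = f"{d[:4]}-{d[4:6]}-{d[6:]}"
            continue
        if current_date and '=' in line:
            key, _, value = line.partition('=')
            key = key.strip()
            if key in KEYS:
                data[current_date][key] = float(value)
    return dict(data)


# Hypothetical sample contents; the real KMCounter.ini may differ.
sample = """\
[total]
lbcount=100
keystrokes=900
[20231020]
lbcount=40
rbcount=3
wheel=12
move=1.5
keystrokes=400
[20231021]
lbcount=60
keystrokes=500
""".splitlines()

result = parse_kmcounter(sample)
print(result['2023-10-21'])
```

The returned dict can be passed to `pd.DataFrame.from_dict(result, orient='index')` exactly as in the script above.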