Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cumulative #313

Merged
merged 3 commits into from
May 3, 2023
Merged

Cumulative #313

merged 3 commits into from
May 3, 2023

Conversation

Edoardo-Pedicillo
Copy link
Contributor

@Edoardo-Pedicillo Edoardo-Pedicillo commented Apr 26, 2023

This PR implements and benchmarks a new function to evaluate the cumulative distribution.
To compare the old method (method1) with the new one (method2), you can use the following code

import time

import numpy as np
import matplotlib.pyplot as plt

REPETITIONS = 100

def cum_method1(real_values_combined, real_values_state0):
    return  [
            sum(map(lambda x: x.real >= real_value, real_values_state0))
            for real_value in real_values_combined
        ]

def cum_method2(input_data, points):
    # data and points sorted
    input_data = np.sort(input_data)
    points = np.sort(points)
 
    prob = []
    app = 0
    for val in input_data:
        app += np.max(np.searchsorted(points[app::], val), 0)
        prob.append(app)

    return np.array(prob)

def time_compare(method, input1, input2):
    start = time.time()
    for _ in range(REPETITIONS):
       cum =  method(input1, input2)
    end = time.time()
    return (end-start)/REPETITIONS, cum

state0_center = [0.0, 0.0]
state1_center = [2.0, 2.0]
cov = 0.1 * np.eye(2)
size = 100  # print(points, input_data)

data_0 = np.random.multivariate_normal(state0_center, cov, size=size)
data_1 = np.random.multivariate_normal(state1_center, cov, size=size)

all_data = np.concatenate((data_0, data_1), axis = 0)

#method1
time1, cum1 = time_compare(cum_method1,all_data[:,0],data_0[:,0])
print("Method 1: ", time1)

#method2
time2, cum2 = time_compare(cum_method2,all_data[:,0],data_0[:,0])
print("Method 2: ", time2)

#Plots
plt.figure(figsize=(14, 7), dpi=80)
plt.subplot(1,2,1)
plt.scatter(data_0[:, 0], data_0[:, 1], label = "0")
plt.scatter(data_1[:, 0], data_1[:, 1], label = "1")

plt.subplot(1,2,2)
plt.scatter(all_data[:,0],cum1, label = f"method1 ({round(time1, 4)})")
plt.scatter(np.sort(all_data[:, 0]), cum2, label = f"method2 ({round(time2,4)})")
plt.legend()
plt.show()

From the outputs we can see:

  1. the new method is four time faster
  2. the old method has a bug ( we have to change the condition x.real >= real_value in x.real <= real_value)
Method 1:  0.0026085638999938966
Method 2:  0.0008060479164123535

image
With the change proposed in point 2 this is the result
image

Checklist:

  • Reviewers confirm new code works as expected.
  • Tests are passing.
  • Coverage does not decrease.
  • Documentation is updated.

@codecov
Copy link

codecov bot commented Apr 26, 2023

Codecov Report

Merging #313 (c9a4e3a) into main (159c5ed) will increase coverage by 0.19%.
The diff coverage is 81.81%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #313      +/-   ##
==========================================
+ Coverage   37.85%   38.05%   +0.19%     
==========================================
  Files          42       42              
  Lines        2800     2809       +9     
==========================================
+ Hits         1060     1069       +9     
  Misses       1740     1740              
Flag Coverage Δ
unittests 38.05% <81.81%> (+0.19%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/qibocal/fitting/methods.py 71.92% <0.00%> (ø)
src/qibocal/fitting/utils.py 100.00% <100.00%> (ø)

@Edoardo-Pedicillo Edoardo-Pedicillo marked this pull request as draft April 26, 2023 07:59
@Edoardo-Pedicillo Edoardo-Pedicillo marked this pull request as ready for review April 28, 2023 09:54
Base automatically changed from alvaro/fix_classification_parameters to main May 3, 2023 12:14
@andrea-pasquale
Copy link
Contributor

@Edoardo-Pedicillo could we add a quick test for the cumulative?

@Edoardo-Pedicillo
Copy link
Contributor Author

Done

@andrea-pasquale andrea-pasquale merged commit 10b566f into main May 3, 2023
@andrea-pasquale andrea-pasquale deleted the cumulative branch May 3, 2023 16:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants