MAP2192 · Mathematics of Data Science

MAP2192 Python Lab Help — Labs 3 through 10

Full instructions and done-for-you solutions for the MAP2192 Mathematics of Data Science Python labs in PyCharm: Matplotlib plots, Pandas data manipulation, NumPy descriptive statistics, SciPy distributions, one- and two-sample t-tests with Statsmodels, simple and multiple linear regression (OLS), logistic regression (Logit), time series decomposition (seasonal_decompose), and ARIMA / Holt-Winters forecasting. If you Googled a line from your lab: yes, we do this class. We also handle any coding assignment for any class: Python, R, Java, C++, SQL, MATLAB, web dev, data science, ML, anything.


MAP2192 · Lab 3

Lab 3: Data Visualization with Matplotlib and PyCharm

2025-01-30

Three activities: (1) Introduction to Matplotlib in PyCharm — line plot of a sine function; (2) Basic data manipulation with Pandas — scatter plot of age vs. salary from a CSV; (3) Customizing plots — adding legends and annotating an outlier with plt.annotate.

Starter code (given in lab)

# Activity 1 — Line plot
import matplotlib.pyplot as plt
import numpy as np

x_values = np.linspace(0, 10, 100)
y_values = np.sin(x_values)
plt.plot(x_values, y_values, label='Sine Function')
plt.title('Line Plot: Sine Function')
plt.xlabel('X-axis'); plt.ylabel('Y-axis')
plt.legend(); plt.show()

# Activity 2 — Scatter from CSV
import pandas as pd
df = pd.read_csv('dataset.csv')
plt.scatter(df['age'], df['salary'], color='red', marker='o',
            label='Age vs. Salary')
plt.title('Scatter Plot: Age vs. Salary')
plt.xlabel('Age'); plt.ylabel('Salary'); plt.legend(); plt.show()

# Activity 3 — Annotate outlier
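# plt.show() above already displayed and released the scatter figure, so
# redraw the scatter first; otherwise the annotation lands on empty axes.
plt.scatter(df['age'], df['salary'], color='red', marker='o',
            label='Age vs. Salary')
plt.annotate('Outlier', xy=(40, 80000), xytext=(35, 85000),
             arrowprops=dict(facecolor='black', shrink=0.05))
plt.legend(); plt.show()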

Worked solution — Goal · Inputs · Outputs · Conclusion

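Activity 1 plots a smooth sine wave through 100 evenly spaced points on [0, 10], with a title, labeled axes, and a legend. Activity 2 loads the age/salary CSV and renders a red scatter of salary against age; whether the trend looks positive and linear depends on the data in dataset.csv. Activity 3 marks the outlier at (40, 80000): xy is the point the arrow points to, and xytext=(35, 85000) is where the 'Outlier' label sits. The label was moved from y=90000 to y=85000 to keep it inside the plotting area. The annotation has to be drawn on the same figure as the scatter, before plt.show(), or it lands on empty axes. facecolor sets the arrow's fill color, and shrink=0.05 trims 5% of the arrow's length from each end.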
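The lab hard-codes the outlier coordinates. As a minimal sketch of the same annotation with the outlier located programmatically, the block below uses a small made-up DataFrame in place of dataset.csv (the rows, the straight-line fit used to flag the outlier, and the label offsets are all assumptions for illustration, not part of the lab).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for dataset.csv; the last row is a planted outlier.
df = pd.DataFrame({
    "age":    [25, 30, 35, 45, 50, 40],
    "salary": [40000, 45000, 50000, 60000, 65000, 80000],
})

# Fit salary on age with a straight line and flag the largest residual.
slope, intercept = np.polyfit(df["age"], df["salary"], 1)
residual = df["salary"] - (slope * df["age"] + intercept)
outlier = df.loc[residual.abs().idxmax()]
x_out, y_out = outlier["age"], outlier["salary"]

fig, ax = plt.subplots()
ax.scatter(df["age"], df["salary"], color="red", marker="o", label="Age vs. Salary")
# Annotate on the same axes as the scatter, before the figure is shown or saved.
ax.annotate("Outlier", xy=(x_out, y_out), xytext=(x_out - 5, y_out + 5000),
            arrowprops=dict(facecolor="black", shrink=0.05))
ax.set_xlabel("Age")
ax.set_ylabel("Salary")
ax.legend()
```

Working through an explicit Axes object (ax) instead of the pyplot state machine makes it harder to annotate the wrong figure by accident.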


MAP2192 · Lab 4

Lab 4: Descriptive Statistics with NumPy and Pandas

2024-01-30

Four activities: (1) Mean / median / standard deviation with NumPy on np.random.normal(loc=20, scale=10, size=100); (2) Series.describe(); (3) groupby('Category') mean of value; (4) intentional ZeroDivisionError to practice PyCharm's debugger.

Starter code (given in lab)

import pandas as pd, numpy as np
np.random.seed(23)
data = np.random.normal(loc=20, scale=10, size=100)
print("Mean:", np.mean(data))
print("Median:", np.median(data))
print("Std Deviation:", np.std(data))

data_series = pd.Series(data, name='Random Data')
print(data_series.describe())

df = pd.DataFrame({
    'Category': ['A','B','A','B','A','B','A','B'],
    'Value':    [10,15,20,25,30,35,40,45]
})
print(df.groupby('Category')['Value'].mean())

def calculate_error():
    a, b = 10, 0
    return a / b  # set a breakpoint here
print(calculate_error())

Worked solution — Goal · Inputs · Outputs · Conclusion

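Activity 1: mean ≈ 20.4, median ≈ 20.1, std ≈ 9.7. np.std defaults to the population formula (ddof=0), so this value is slightly smaller than the std that describe() reports, which uses the sample formula (ddof=1). Activity 2: describe() returns count 100, min -3.2, max 44.8, and 25% / 50% / 75% quartiles around 14 / 20 / 27. Activity 3: grouped means are A = 25.0 and B = 30.0. Activity 4: the call raises ZeroDivisionError; with a breakpoint on the a / b line, PyCharm's debugger pauses there so you can inspect the locals a and b before the exception is thrown.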
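NumPy and pandas use different default divisors for the standard deviation, which is easy to miss when comparing Activity 1 with Activity 2. The sketch below reuses the lab's seed and data; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(23)
data = np.random.normal(loc=20, scale=10, size=100)
series = pd.Series(data, name="Random Data")

pop_std = np.std(data)                    # divides by n (ddof=0)
sample_std = np.std(data, ddof=1)         # divides by n - 1
described_std = series.describe()["std"]  # pandas uses ddof=1

# The two differ by the fixed factor sqrt(n / (n - 1)).
ratio = sample_std / pop_std

# Activity 3 check: group means of the lab's small DataFrame.
df = pd.DataFrame({
    "Category": ["A", "B", "A", "B", "A", "B", "A", "B"],
    "Value":    [10, 15, 20, 25, 30, 35, 40, 45],
})
group_means = df.groupby("Category")["Value"].mean()
```

With n = 100 the factor is sqrt(100 / 99), about 0.5% larger, so the two numbers agree to the first decimal but not beyond.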


MAP2192 · Lab 5

Lab 5: Probability and Distributions with SciPy and PyCharm

2024-02-13

Three activities: (1) continuous — fit a normal distribution and overlay PDF; (2) discrete — Binomial(n=10, p=0.5) histogram; (3) simulation — Poisson(λ=3) histogram.

Starter code (given in lab)

import numpy as np
from numpy import random, arange
import matplotlib.pyplot as plt
from scipy.stats import norm, binom, poisson

# Activity 1 — Normal fit
random.seed(42)
data_normal = random.normal(loc=0, scale=1, size=1000)
mu, std = norm.fit(data_normal)
plt.hist(data_normal, bins=30, density=True, alpha=0.6, color='g')
xmin, xmax = plt.xlim()
x = np.linspace(xmin, xmax, 100)
plt.plot(x, norm.pdf(x, mu, std), 'k', linewidth=2)
plt.title("Fit results: mu = %.2f, std = %.2f" % (mu, std))
plt.show()

# Activity 2 — Binomial
n, p = 10, 0.5
data_binomial = random.binomial(n, p, 1000)
plt.hist(data_binomial, bins=arange(0, n+2)-0.5, density=True, color='b')
plt.title('Binomial Distribution (n=10, p=0.5)'); plt.show()

# Activity 3 — Poisson
data_poisson = random.poisson(3, 1000)
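# Half-integer bin edges center each bar on its integer count
# (range(10) would merge k = 8 and k = 9 into the last bin).
plt.hist(data_poisson, bins=arange(0, 11)-0.5, density=True, color='r')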
plt.title('Poisson Distribution (lambda=3)'); plt.show()

Worked solution — Goal · Inputs · Outputs · Conclusion

Activity 1: the fitted parameters are mu ≈ 0.02 and std ≈ 0.98, and the bell-shaped PDF tracks the histogram closely. Activity 2: the binomial mass concentrates at k = 5 (the expectation np = 5) with symmetric tails. Activity 3: the Poisson histogram peaks at k = 2 and k = 3, which are equally likely when λ = 3, and decays rapidly beyond k = 6.
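The histogram shapes described above can be checked against the exact probability mass functions. The following sketch compares simulated frequencies with scipy.stats.binom.pmf and scipy.stats.poisson.pmf; the generator seed and variable names are assumptions, not the lab's.

```python
import numpy as np
from scipy.stats import binom, poisson

rng = np.random.default_rng(0)  # assumed seed, chosen only for repeatability
n, p, lam, size = 10, 0.5, 3, 1000

# Binomial: empirical frequencies of k = 0..n versus the exact pmf.
binom_draws = rng.binomial(n, p, size)
binom_empirical = np.bincount(binom_draws, minlength=n + 1) / size
binom_exact = binom.pmf(np.arange(n + 1), n, p)
binom_mode = int(np.argmax(binom_exact))          # k with the largest mass

# Poisson: k = 2 and k = 3 carry the same mass when lambda = 3.
poisson_at_2 = poisson.pmf(2, lam)
poisson_at_3 = poisson.pmf(3, lam)

# Mass at k >= 10, which a histogram with edges stopping near 10 leaves out.
poisson_tail = poisson.sf(9, lam)
```

The tail mass beyond k = 9 is on the order of 0.1%, so truncating the Poisson histogram there barely changes the picture.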


MAP2192 · Lab 6

Lab 6: Hypothesis Testing with Statsmodels and PyCharm

2025-02-20

Two activities: (1) one-sample t-test against popmean=5 on N(loc=5, scale=2, size=30); (2) two-sample t-test between samples drawn from N(5,2) and N(6,2).

Starter code (given in lab)

import numpy as np
from scipy import stats
import statsmodels.api as sm

# Activity 1
np.random.seed(42)
sample_data = np.random.normal(loc=5, scale=2, size=30)
t_stat, p_value = stats.ttest_1samp(sample_data, popmean=5)
print("One-Sample T-Test  t =", t_stat, "  p =", p_value)

# Activity 2
sample1 = np.random.normal(loc=5, scale=2, size=30)
sample2 = np.random.normal(loc=6, scale=2, size=30)
t_stat, p_value, df = sm.stats.ttest_ind(sample1, sample2)
print("Two-Sample T-Test  t =", t_stat, "  p =", p_value, "  df =", df)

Worked solution — Goal · Inputs · Outputs · Conclusion

Activity 1: t ≈ -1.15, p ≈ 0.26, df = 29. We fail to reject H0; the sample mean (about 4.62) is statistically indistinguishable from 5. Activity 2: t ≈ -2.55, p ≈ 0.013, df = 58. At α = 0.05 we reject H0 of equal means, and the negative sign matches the true gap (sample1 is drawn from the lower-mean population).
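To see where the test statistics come from, the sketch below recomputes both by hand and compares them with SciPy. It assumes the pooled-variance (equal variances) form for the two-sample test, which is the default in scipy.stats.ttest_ind; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

np.random.seed(42)
sample_data = np.random.normal(loc=5, scale=2, size=30)
sample1 = np.random.normal(loc=5, scale=2, size=30)
sample2 = np.random.normal(loc=6, scale=2, size=30)

# One-sample: t = (mean - mu0) / (s / sqrt(n)), with s the sample std (ddof=1).
n = sample_data.size
t1_manual = (sample_data.mean() - 5) / (sample_data.std(ddof=1) / np.sqrt(n))
t1_scipy, p1_scipy = stats.ttest_1samp(sample_data, popmean=5)

# Two-sample, pooled variance: df = n1 + n2 - 2.
n1, n2 = sample1.size, sample2.size
dof = n1 + n2 - 2
pooled_var = ((n1 - 1) * sample1.var(ddof=1) + (n2 - 1) * sample2.var(ddof=1)) / dof
t2_manual = (sample1.mean() - sample2.mean()) / np.sqrt(pooled_var * (1 / n1 + 1 / n2))
p2_manual = 2 * stats.t.sf(abs(t2_manual), dof)   # two-sided p-value
t2_scipy, p2_scipy = stats.ttest_ind(sample1, sample2)
```

If the two groups had clearly different variances, passing equal_var=False to ttest_ind (Welch's test) would be the safer choice.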


MAP2192 · Lab 7

Lab 7: Regression Analysis with Statsmodels

2025-03-06

Two activities: (1) simple linear regression on y = 2x + 1 + noise; (2) multiple linear regression on y = 2·x1 + 1.5·x2 + 1 + noise using OLS.

Starter code (given in lab)

import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

# Activity 1 — Simple
np.random.seed(42)
x = np.random.rand(50) * 10
y = 2*x + 1 + np.random.randn(50)*2
X = sm.add_constant(x)
model = sm.OLS(y, X).fit()
print(model.summary())
plt.scatter(x, y, label='Data')
plt.plot(x, model.predict(X), color='red', label='Regression Line')
plt.legend(); plt.show()

# Activity 2 — Multiple
x1 = np.random.rand(50)*10
x2 = np.random.rand(50)*5
y2 = 2*x1 + 1.5*x2 + 1 + np.random.randn(50)*2
X2 = sm.add_constant(np.column_stack((x1, x2)))
print(sm.OLS(y2, X2).fit().summary())

Worked solution — Goal · Inputs · Outputs · Conclusion

Activity 1 estimated intercept ≈ 1.0, slope ≈ 2.0, R² ≈ 0.92 — recovers the true relationship y = 2x + 1 with small noise. Activity 2 coefficients: const ≈ 1.0, x1 ≈ 2.0, x2 ≈ 1.5, R² ≈ 0.95; both predictors are statistically significant (p < 0.001).
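The OLS estimates in Activity 1 can be reproduced without Statsmodels, which helps when reading the summary table. This is a sketch using NumPy's least-squares solver on the lab's simulated data; the variable names are illustrative.

```python
import numpy as np

np.random.seed(42)
x = np.random.rand(50) * 10
y = 2 * x + 1 + np.random.randn(50) * 2

# Design matrix with an explicit intercept column, like sm.add_constant.
X = np.column_stack((np.ones_like(x), x))
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = beta

# For one predictor the slope is also cov(x, y) / var(x).
slope_from_cov = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

# R^2 = 1 - SS_residual / SS_total.
residuals = y - X @ beta
r_squared = 1 - (residuals @ residuals) / ((y - y.mean()) ** 2).sum()
```

The intercept is estimated less precisely than the slope here because the x values are spread over [0, 10] and the noise has standard deviation 2.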


MAP2192 · Lab 8

Lab 8: Logistic Regression with Statsmodels and PyCharm

2025-03-21

Generate x ~ U(0,10) and binary y = sigmoid(2x − 10) > 0.5, fit sm.Logit, and plot the sigmoid predicted probability curve against the data points.

Starter code (given in lab)

import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

np.random.seed(42)
x_logistic = np.random.rand(100) * 10
y_logistic = (1 / (1 + np.exp(-(2*x_logistic - 10)))) > 0.5

X_logistic = sm.add_constant(x_logistic)
model_logistic = sm.Logit(y_logistic.astype(int), X_logistic).fit()
print(model_logistic.summary())

plt.scatter(x_logistic, y_logistic, marker='o', label='Data Points')
order = np.argsort(x_logistic)
plt.plot(x_logistic[order],
         model_logistic.predict(X_logistic)[order],
         color='red', label='Logistic Regression Curve')
plt.title('Logistic Regression'); plt.xlabel('X'); plt.ylabel('Probability')
plt.legend(); plt.show()

Worked solution — Goal · Inputs · Outputs · Conclusion

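Because y is a deterministic threshold of x (y = 1 exactly when x > 5), the two classes are perfectly separated and the maximum-likelihood estimate does not exist: the likelihood keeps improving as the coefficients grow without bound. Depending on the statsmodels version, fit() either raises PerfectSeparationError or emits a perfect-separation warning and returns very large, unstable coefficients with huge standard errors, so the fit does not recover const = -10 and x = 2. What the fit does pin down is the decision boundary -const/x, which falls near x = 5. Pseudo R² is ≈ 1.00 for the same reason, and the predicted-probability curve jumps from ~0 to ~1 almost as a step at the boundary. To estimate the generating coefficients, the labels would have to be sampled with probability sigmoid(2x − 10) instead of thresholded.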


MAP2192 · Lab 9

Lab 9: Time Series Analysis with Statsmodels, Pandas, and PyCharm

2025-03-27

Generate a daily synthetic time series centered around 10 from 2022-01-01 to 2022-04-01, then decompose with seasonal_decompose (additive model, weekly period) and plot trend / seasonal / residual components.

Starter code (given in lab)

import numpy as np, pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

np.random.seed(42)
date_rng = pd.date_range(start='2022-01-01', end='2022-04-01', freq='D')
ts_data = np.random.randn(len(date_rng)) + 10
ts_df = pd.DataFrame(ts_data, columns=['Value'], index=date_rng)
print(ts_df.head())

result = seasonal_decompose(ts_df['Value'], model='additive', period=7)
result.plot()
plt.show()

Worked solution — Goal · Inputs · Outputs · Conclusion

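The decomposition splits the series into trend, seasonal (7-day period), and residual components. The trend is a centered 7-day moving average, so it wanders within a few tenths of 10 with no sustained drift, because the data are i.i.d. noise around 10. The seasonal component repeats every week with an amplitude of roughly ±0.3, and the residuals are centered on 0 with a standard deviation a little under 1. A seasonal pattern that small relative to the residual spread is an artifact of averaging noise, not real seasonality, which makes this series a useful baseline before forecasting.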
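An additive decomposition with an odd period can be approximated by hand in a few lines, which makes the origin of each component concrete. The sketch below follows the classical recipe (centered moving average, then per-position averages of the detrended series); it is an approximation written for illustration and is not claimed to match seasonal_decompose digit for digit.

```python
import numpy as np
import pandas as pd

np.random.seed(42)
date_rng = pd.date_range(start="2022-01-01", end="2022-04-01", freq="D")
values = pd.Series(np.random.randn(len(date_rng)) + 10, index=date_rng)

period = 7

# Trend: centered moving average; the first and last 3 days have no full window.
trend = values.rolling(window=period, center=True).mean()

# Seasonal: average the detrended series by position in the 7-day cycle,
# then center the seven averages so they sum to zero.
detrended = values - trend
position = np.arange(len(values)) % period
seasonal_means = detrended.groupby(position).mean()   # NaNs are skipped
seasonal_means = seasonal_means - seasonal_means.mean()
seasonal = pd.Series(seasonal_means.to_numpy()[position], index=values.index)

# Residual: whatever the trend and seasonal parts do not explain.
residual = values - trend - seasonal
```

Each seasonal value is an average of only about a dozen noisy observations, which is why a weekly pattern appears even though none was simulated.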


MAP2192 · Lab 10

Lab 10: Time Series Forecasting with Statsmodels, Pandas, and PyCharm

2025-04-03

Two activities: (1) fit an ARIMA(1,1,1) on the synthetic daily series and forecast 30 days ahead; (2) fit a Holt-Winters exponential smoothing model (additive trend + additive seasonality, 7-day period) and forecast 30 days ahead.

Starter code (given in lab)

import numpy as np, pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.holtwinters import ExponentialSmoothing

np.random.seed(42)
date_rng = pd.date_range(start='2022-01-01', end='2022-04-01', freq='D')
ts_data = np.random.randn(len(date_rng)) + 10
ts_df = pd.DataFrame(ts_data, columns=['Value'], index=date_rng)

# Activity 1 — ARIMA(1,1,1)
arima = ARIMA(ts_df['Value'], order=(1,1,1)).fit()
fc = arima.get_forecast(steps=30)
idx = pd.date_range(start='2022-04-02', end='2022-05-01', freq='D')
plt.plot(ts_df['Value'], label='Original')
plt.plot(idx, fc.predicted_mean, color='red', label='ARIMA Forecast')
plt.legend(); plt.title('ARIMA(1,1,1) Forecast'); plt.show()

# Activity 2 — Holt-Winters
hw = ExponentialSmoothing(ts_df['Value'],
        trend='add', seasonal='add', seasonal_periods=7).fit()
hw_fc = hw.forecast(steps=30)
plt.plot(ts_df['Value'], label='Original')
plt.plot(hw_fc.index, hw_fc.values, color='red', label='Holt-Winters')
plt.legend(); plt.title('Holt-Winters Forecast'); plt.show()

Worked solution — Goal · Inputs · Outputs · Conclusion

Activity 1: ARIMA(1,1,1) forecasts a nearly flat line around 10, as expected for white-noise data. The starter plot shows only fc.predicted_mean; the interval bounds, available from fc.conf_int(), widen with the horizon. Activity 2: Holt-Winters fits a trend and a weekly seasonal pattern even though both are spurious, producing a forecast that oscillates by roughly ±0.4 around 10. Neither model finds predictability beyond the series mean; the Holt-Winters forecast only looks more structured because the model imposes seasonality whether or not it exists.
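One way to back up the claim that the series has no predictability beyond its mean is to score simple baselines on held-out data. The sketch below is an addition to the lab, not part of it: the 30-day hold-out split, the two baselines, and the helper name rmse are assumptions.

```python
import numpy as np

np.random.seed(42)
series = np.random.randn(91) + 10          # same 91 daily values as the lab

# Hold out the last 30 days as a test window (assumed split).
train, test = series[:-30], series[-30:]

# Baseline 1: forecast the training mean for every future day.
mean_forecast = np.full(test.size, train.mean())

# Baseline 2: seasonal naive, repeating the last observed week.
seasonal_naive = np.resize(train[-7:], test.size)


def rmse(forecast):
    """Root mean squared error against the held-out values."""
    return float(np.sqrt(np.mean((test - forecast) ** 2)))


rmse_mean = rmse(mean_forecast)
rmse_seasonal = rmse(seasonal_naive)
```

For white noise with unit variance, the mean forecast has an expected RMSE near 1, while copying last week's values roughly doubles the error variance; an ARIMA or Holt-Winters fit that cannot beat the mean baseline on held-out data is not capturing real structure.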


Need your MAP2192 lab — or any coding assignment — done?

We handle any coding assignment for any class — Python, R, MATLAB, Java, C/C++, SQL, HTML/CSS/JS, React, data science, machine learning, statistics, web development, mobile, algorithms, data structures. Send us the PDF / Word / Canvas link and we'll deliver runnable code plus the written report (Goal, Inputs, Outputs, Conclusion) on time.
