MAP2192 · Mathematics of Data Science

MAP2192 Python Lab Help — Labs 3 through 10

Full instructions and done-for-you solutions for the MAP2192 Mathematics of Data Science Python labs in PyCharm: Matplotlib plots, Pandas data manipulation, NumPy descriptive statistics, SciPy distributions, one- and two-sample t-tests with SciPy and Statsmodels, simple and multiple linear regression (OLS), logistic regression (Logit), time series decomposition (seasonal_decompose), and ARIMA / Holt-Winters forecasting. If you Googled a line from your lab: yes, we do this class. We also handle any coding assignment for any class, including Python, R, Java, C++, SQL, MATLAB, web dev, data science, and ML.

MAP2192 · Lab 3

Lab 3: Data Visualization with Matplotlib and PyCharm

2025-01-30

Three activities: (1) Introduction to Matplotlib in PyCharm — line plot of a sine function; (2) Basic data manipulation with Pandas — scatter plot of age vs. salary from a CSV; (3) Customizing plots — adding legends and annotating an outlier with plt.annotate.

Starter code (given in lab)

# Activity 1 — Line plot
import matplotlib.pyplot as plt
import numpy as np

x_values = np.linspace(0, 10, 100)
y_values = np.sin(x_values)
plt.plot(x_values, y_values, label='Sine Function')
plt.title('Line Plot: Sine Function')
plt.xlabel('X-axis'); plt.ylabel('Y-axis')
plt.legend(); plt.show()

# Activity 2 — Scatter from CSV
import pandas as pd
df = pd.read_csv('dataset.csv')
plt.scatter(df['age'], df['salary'], color='red', marker='o',
            label='Age vs. Salary')
plt.title('Scatter Plot: Age vs. Salary')
plt.xlabel('Age'); plt.ylabel('Salary'); plt.legend(); plt.show()

# Activity 3 — Annotate outlier
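# Redraw the scatter first: the plt.show() above finished the previous figure,
# so annotating without replotting would draw the arrow on empty axes.
plt.scatter(df['age'], df['salary'], color='red', marker='o',
            label='Age vs. Salary')
plt.annotate('Outlier', xy=(40, 80000), xytext=(35, 85000),
             arrowprops=dict(facecolor='black', shrink=0.05))
plt.legend(); plt.show()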

Worked solution — Goal · Inputs · Outputs · Conclusion

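Activity 1 plots a smooth sine wave using 100 points from 0 to 10, with a title, labeled axes, and a legend. Activity 2 loads the age/salary CSV and renders a red scatter plot of salary against age, which for the lab dataset shows a positive, roughly linear relationship. Activity 3 marks the outlier at (40, 80000) with an arrow whose label text sits at (35, 85000); the text position was moved from y=90000 to y=85000 to keep it inside the plotting area. The annotation has to be added to the scatter figure before plt.show() is called, otherwise it lands on empty axes. In arrowprops, facecolor sets the arrow's fill color and shrink trims the arrow slightly at both ends.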
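The outlier coordinates in Activity 3 are hard-coded. As a rough sketch of how they could be located from the data instead, the snippet below picks the point whose salary lies farthest from the mean in standard-deviation units. The ages and salaries are made-up illustration values (not the lab's dataset.csv), and find_outlier is a hypothetical helper name.

```python
import statistics

# Hypothetical (age, salary) pairs, NOT the lab's dataset.csv
ages = [25, 30, 35, 40, 45, 50]
salaries = [40000, 45000, 50000, 80000, 60000, 65000]


def find_outlier(xs, ys):
    """Return the (x, y) pair whose y has the largest absolute z-score."""
    mean_y = statistics.fmean(ys)
    std_y = statistics.stdev(ys)  # sample standard deviation
    z_scores = [abs(y - mean_y) / std_y for y in ys]
    idx = max(range(len(ys)), key=z_scores.__getitem__)
    return xs[idx], ys[idx]


outlier_xy = find_outlier(ages, salaries)
# Place the label slightly up and to the left of the point
text_xy = (outlier_xy[0] - 5, outlier_xy[1] + 5000)
print(outlier_xy, text_xy)
```

The two tuples can then be passed as xy and xytext to plt.annotate. A z-score on salary alone is a simplification; a residual from a fitted line would be a more careful choice when salary depends strongly on age.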

MAP2192 · Lab 4

Lab 4: Descriptive Statistics with NumPy and Pandas

2024-01-30

Four activities: (1) mean / median / standard deviation with NumPy on np.random.normal(loc=20, scale=10, size=100); (2) Series.describe(); (3) groupby('Category') mean of Value; (4) an intentional ZeroDivisionError to practice with PyCharm's debugger.

Starter code (given in lab)

import pandas as pd, numpy as np
np.random.seed(23)
data = np.random.normal(loc=20, scale=10, size=100)
print("Mean:", np.mean(data))
print("Median:", np.median(data))
print("Std Deviation:", np.std(data))

data_series = pd.Series(data, name='Random Data')
print(data_series.describe())

df = pd.DataFrame({
    'Category': ['A','B','A','B','A','B','A','B'],
    'Value':    [10,15,20,25,30,35,40,45]
})
print(df.groupby('Category')['Value'].mean())

def calculate_error():
    a, b = 10, 0
    return a / b  # set a breakpoint here
print(calculate_error())

Worked solution — Goal · Inputs · Outputs · Conclusion

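Activity 1: mean ≈ 20.4, median ≈ 20.1, std ≈ 9.7 (np.std defaults to the population formula, ddof=0). Activity 2: describe() returns count 100, min -3.2, max 44.8, and 25/50/75% quartiles around 14 / 20 / 27; its std uses the sample formula (ddof=1), so it comes out slightly larger than np.std. Activity 3: grouped means are A = 25.0 and B = 30.0. Activity 4: the call raises ZeroDivisionError and ends the script; with a breakpoint on the a / b line, PyCharm's debugger pauses there so you can inspect the locals a and b.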
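To make the grouped means and the two standard-deviation conventions concrete, here is a small standard-library sketch that reuses the lab's Category/Value pairs. It is only an illustration of what groupby(...).mean(), np.std (ddof=0), and describe() (ddof=1) compute, not a replacement for the pandas code.

```python
import statistics
from collections import defaultdict

# Same Category/Value pairs as the lab's DataFrame
categories = ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
values = [10, 15, 20, 25, 30, 35, 40, 45]

# groupby('Category')['Value'].mean(), done by hand
groups = defaultdict(list)
for category, value in zip(categories, values):
    groups[category].append(value)
group_means = {c: statistics.fmean(v) for c, v in groups.items()}
print(group_means)  # {'A': 25.0, 'B': 30.0}

# Population std (divide by n): what np.std computes by default
pop_std = statistics.pstdev(values)
# Sample std (divide by n - 1): what describe() reports
sample_std = statistics.stdev(values)
print(round(pop_std, 3), round(sample_std, 3))  # 11.456 12.247
```

The gap between the two standard deviations shrinks as n grows, which is why it is barely visible on the lab's 100-point sample.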

MAP2192 · Lab 5

Lab 5: Probability and Distributions with SciPy and PyCharm

2024-02-13

Three activities: (1) continuous — fit a normal distribution and overlay PDF; (2) discrete — Binomial(n=10, p=0.5) histogram; (3) simulation — Poisson(λ=3) histogram.

Starter code (given in lab)

import numpy as np
from numpy import random, arange
import matplotlib.pyplot as plt
from scipy.stats import norm, binom, poisson

# Activity 1 — Normal fit
random.seed(42)
data_normal = random.normal(loc=0, scale=1, size=1000)
mu, std = norm.fit(data_normal)
plt.hist(data_normal, bins=30, density=True, alpha=0.6, color='g')
xmin, xmax = plt.xlim()
x = np.linspace(xmin, xmax, 100)
plt.plot(x, norm.pdf(x, mu, std), 'k', linewidth=2)
plt.title("Fit results: mu = %.2f, std = %.2f" % (mu, std))
plt.show()

# Activity 2 — Binomial
n, p = 10, 0.5
data_binomial = random.binomial(n, p, 1000)
plt.hist(data_binomial, bins=arange(0, n+2)-0.5, density=True, color='b')
plt.title('Binomial Distribution (n=10, p=0.5)'); plt.show()

# Activity 3 — Poisson
data_poisson = random.poisson(3, 1000)
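# Center one bin on each integer count (as in Activity 2) so 8 and 9 are not merged
plt.hist(data_poisson, bins=arange(0, data_poisson.max() + 2) - 0.5,
         density=True, color='r')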
plt.title('Poisson Distribution (lambda=3)'); plt.show()

Worked solution — Goal · Inputs · Outputs · Conclusion

Activity 1: the fitted parameters are mu ≈ 0.02 and std ≈ 0.98, close to the true values 0 and 1, and the bell-shaped PDF overlays the histogram tightly. Activity 2: the binomial mass concentrates at k = 5 (the expectation np = 5) with roughly symmetric tails. Activity 3: the Poisson histogram peaks at k = 2 and k = 3, which are equally likely when λ = 3, and decays rapidly for k ≥ 6.
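The histogram shapes described above can be checked against the exact probabilities. The following sketch evaluates the Binomial(10, 0.5) and Poisson(3) mass functions directly from their formulas; binom_pmf and poisson_pmf are hypothetical helper names (scipy.stats.binom.pmf and poisson.pmf would give the same numbers).

```python
import math


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k / math.factorial(k) * math.exp(-lam)


binom_probs = [binom_pmf(k, 10, 0.5) for k in range(11)]
poisson_probs = [poisson_pmf(k, 3) for k in range(10)]

print(round(binom_probs[5], 4))  # 0.2461
print(round(poisson_probs[2], 4), round(poisson_probs[3], 4))  # 0.224 0.224
```

With density=True and one bin per integer, the simulated bar heights should land near these values, up to sampling noise from the 1000 draws.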

MAP2192 · Lab 6

Lab 6: Hypothesis Testing with Statsmodels and PyCharm

2025-02-20

Two activities: (1) one-sample t-test against popmean=5 on N(loc=5, scale=2, size=30); (2) two-sample t-test between samples drawn from N(5,2) and N(6,2).

Starter code (given in lab)

import numpy as np
from scipy import stats
import statsmodels.api as sm

# Activity 1
np.random.seed(42)
sample_data = np.random.normal(loc=5, scale=2, size=30)
t_stat, p_value = stats.ttest_1samp(sample_data, popmean=5)
print("One-Sample T-Test  t =", t_stat, "  p =", p_value)

# Activity 2
sample1 = np.random.normal(loc=5, scale=2, size=30)
sample2 = np.random.normal(loc=6, scale=2, size=30)
t_stat, p_value, df = sm.stats.ttest_ind(sample1, sample2)
print("Two-Sample T-Test  t =", t_stat, "  p =", p_value, "  df =", df)

Worked solution — Goal · Inputs · Outputs · Conclusion

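Activity 1: t ≈ -1.15, p ≈ 0.26, so we fail to reject H0; the sample mean (about 4.62) is statistically indistinguishable from 5. Activity 2: t ≈ -1.85, p ≈ 0.069, df = 58 (sm.stats.ttest_ind pools the variances by default, giving df = 30 + 30 - 2). The result is borderline: at α = 0.05 we fail to reject H0 of equal means, although the sign of t matches the true gap between the population means (5 vs. 6).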
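For readers who want to see where the t statistics and degrees of freedom come from, here is a minimal standard-library sketch of the two formulas. The measurements are hypothetical illustration values, and one_sample_t and pooled_two_sample_t are made-up helper names; the p-values still require a t distribution (for example from SciPy), which this sketch does not compute.

```python
import math
import statistics


def one_sample_t(sample, popmean):
    """t statistic and degrees of freedom for H0: mean == popmean."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.fmean(sample) - popmean) / se, n - 1


def pooled_two_sample_t(a, b):
    """Equal-variance (pooled) two-sample t statistic and df."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.fmean(a) - statistics.fmean(b)) / se, df


# Hypothetical measurements, chosen only for illustration
sample_a = [4.1, 5.3, 6.2, 4.8, 5.6]
sample_b = [5.9, 6.4, 7.1, 5.2, 6.8]

t1, df1 = one_sample_t(sample_a, popmean=5)
t2, df2 = pooled_two_sample_t(sample_a, sample_b)
print(round(t1, 3), df1)
print(round(t2, 3), df2)
```

The pooled df of n1 + n2 - 2 is the same rule that yields df = 58 for the lab's two samples of 30.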

MAP2192 · Lab 7

Lab 7: Regression Analysis with Statsmodels

2025-03-06

Two activities: (1) simple linear regression on y = 2x + 1 + noise; (2) multiple linear regression on y = 2·x1 + 1.5·x2 + 1 + noise using OLS.

Starter code (given in lab)

import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

# Activity 1 — Simple
np.random.seed(42)
x = np.random.rand(50) * 10
y = 2*x + 1 + np.random.randn(50)*2
X = sm.add_constant(x)
model = sm.OLS(y, X).fit()
print(model.summary())
plt.scatter(x, y, label='Data')
plt.plot(x, model.predict(X), color='red', label='Regression Line')
plt.legend(); plt.show()

# Activity 2 — Multiple
x1 = np.random.rand(50)*10
x2 = np.random.rand(50)*5
y2 = 2*x1 + 1.5*x2 + 1 + np.random.randn(50)*2
X2 = sm.add_constant(np.column_stack((x1, x2)))
print(sm.OLS(y2, X2).fit().summary())

Worked solution — Goal · Inputs · Outputs · Conclusion

Activity 1 estimated intercept ≈ 1.0, slope ≈ 2.0, R² ≈ 0.92 — recovers the true relationship y = 2x + 1 with small noise. Activity 2 coefficients: const ≈ 1.0, x1 ≈ 2.0, x2 ≈ 1.5, R² ≈ 0.95; both predictors are statistically significant (p < 0.001).
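The slope, intercept, and R² that OLS reports for a single predictor have closed forms. The sketch below applies them to a small deterministic toy dataset (y = 2x + 1 plus an alternating ±0.5 perturbation, which is an assumption made here so the result is reproducible without a random seed); it is not the lab's random data.

```python
# Deterministic toy data: y = 2x + 1 plus a small alternating perturbation
xs = [float(i) for i in range(10)]
ys = [2 * x + 1 + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares estimates for one predictor
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# R^2 = 1 - SS_res / SS_tot
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot

print(round(slope, 3), round(intercept, 3), round(r_squared, 3))
```

The estimates land close to, but not exactly on, 2 and 1 because the perturbation is slightly correlated with x; the same effect explains why the lab's fitted coefficients are only approximately equal to the true ones.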

MAP2192 · Lab 8

Lab 8: Logistic Regression with Statsmodels and PyCharm

2025-03-21

Generate x ~ U(0,10) and binary y = sigmoid(2x − 10) > 0.5, fit sm.Logit, and plot the sigmoid predicted probability curve against the data points.

Starter code (given in lab)

import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

np.random.seed(42)
x_logistic = np.random.rand(100) * 10
y_logistic = (1 / (1 + np.exp(-(2*x_logistic - 10)))) > 0.5

X_logistic = sm.add_constant(x_logistic)
model_logistic = sm.Logit(y_logistic.astype(int), X_logistic).fit()
print(model_logistic.summary())

plt.scatter(x_logistic, y_logistic, marker='o', label='Data Points')
order = np.argsort(x_logistic)
plt.plot(x_logistic[order],
         model_logistic.predict(X_logistic)[order],
         color='red', label='Logistic Regression Curve')
plt.title('Logistic Regression'); plt.xlabel('X'); plt.ylabel('Probability')
plt.legend(); plt.show()

Worked solution — Goal · Inputs · Outputs · Conclusion

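Because y is a deterministic threshold of x (y = 1 exactly when x > 5), the two classes are perfectly separated and the logistic maximum-likelihood estimate does not exist: the likelihood keeps increasing as the coefficients grow. Depending on the statsmodels version, fit() either raises PerfectSeparationError or emits a perfect-separation warning and reports very large coefficients with huge standard errors, so the fit does not recover const = -10 and x = 2. Only the ratio -const/x ≈ 5 (the decision boundary) is meaningful, and pseudo R² ≈ 1.00. If the fit completes, the plotted curve is close to a step from 0 to 1 at x = 5.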
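The effect of deterministic labels on the likelihood can be seen without any fitting library. The sketch below uses ten evenly spaced x values (an assumption for illustration, not the lab's random draws), labels them with the same x > 5 rule, and evaluates the logistic log-likelihood along the direction (const, slope) = (-5k, k); log_likelihood is a hypothetical helper.

```python
import math

# Deterministic labels as in the lab: y = 1 exactly when x > 5
xs = [0.5 + i for i in range(10)]  # 0.5, 1.5, ..., 9.5
ys = [1 if x > 5 else 0 for x in xs]


def log_likelihood(const, slope):
    """Bernoulli log-likelihood of a logistic model, computed stably."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = const + slope * x
        # We need log(sigmoid(z)) if y == 1 and log(sigmoid(-z)) if y == 0
        s = z if y == 1 else -z
        if s >= 0:
            total += -math.log1p(math.exp(-s))
        else:
            total += s - math.log1p(math.exp(s))
    return total


# Scale the coefficients while keeping the boundary -const/slope at 5
scales = [1, 2, 5, 10]
logliks = [log_likelihood(-5 * k, k) for k in scales]
for k, ll in zip(scales, logliks):
    print(k, ll)
```

The log-likelihood climbs toward 0 as k grows, and k = 2 (the data-generating coefficients -10 and 2) is not a maximum. Drawing the labels at random with probability sigmoid(2x - 10), instead of thresholding at 0.5, would be one way to give the model a finite maximum-likelihood estimate.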

MAP2192 · Lab 9

Lab 9: Time Series Analysis with Statsmodels, Pandas, and PyCharm

2025-03-27

Generate a daily synthetic time series centered around 10 from 2022-01-01 to 2022-04-01, then decompose with seasonal_decompose (additive model, weekly period) and plot trend / seasonal / residual components.

Starter code (given in lab)

import numpy as np, pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

np.random.seed(42)
date_rng = pd.date_range(start='2022-01-01', end='2022-04-01', freq='D')
ts_data = np.random.randn(len(date_rng)) + 10
ts_df = pd.DataFrame(ts_data, columns=['Value'], index=date_rng)
print(ts_df.head())

result = seasonal_decompose(ts_df['Value'], model='additive', period=7)
result.plot()
plt.show()

Worked solution — Goal · Inputs · Outputs · Conclusion

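The decomposition splits the series into trend, seasonal (7-day period), and residual components. The trend hovers around 10.0 with very small drift because the data are i.i.d. noise; it is undefined (NaN) for the first and last three days, where the centered 7-day window does not fit. The seasonal component oscillates within roughly ±0.3 on a weekly cycle, but this is an artifact of averaging noise by weekday: seasonal_decompose always returns a periodic component, whether or not one exists. Residuals are centered on 0 with a spread close to the noise level (std ≈ 1.0), so the decomposition finds no real seasonality, which makes it a useful baseline before forecasting.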
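As a rough sketch of the classical additive procedure behind this kind of decomposition (centered moving average for the trend, per-weekday averages for the seasonal part), the snippet below works on a hypothetical noise-free series with a known weekly pattern. It is a simplified illustration and is not claimed to match seasonal_decompose in every detail.

```python
period = 7
# Hypothetical weekly pattern that sums to zero, added to a level of 10
pattern = [2.0, 1.0, 0.0, -1.0, -2.0, 0.0, 0.0]
series = [10.0 + pattern[i % period] for i in range(28)]

# Trend: centered moving average over one full period (undefined at the edges)
half = period // 2
trend = [None] * len(series)
for i in range(half, len(series) - half):
    trend[i] = sum(series[i - half:i + half + 1]) / period

# Seasonal: average the detrended values by position in the week,
# then center the seven averages so they sum to zero
by_position = [[] for _ in range(period)]
for i, (value, t) in enumerate(zip(series, trend)):
    if t is not None:
        by_position[i % period].append(value - t)
raw = [sum(vals) / len(vals) for vals in by_position]
offset = sum(raw) / period
seasonal = [r - offset for r in raw]

# Residual: what is left after removing the trend and seasonal parts
resid = [value - t - seasonal[i % period]
         for i, (value, t) in enumerate(zip(series, trend))
         if t is not None]

print([round(s, 6) for s in seasonal])
print(max(abs(r) for r in resid))
```

On this clean series the procedure recovers the weekly pattern and leaves essentially zero residual. On pure noise, as in the lab, the same averaging still produces a nonzero weekly pattern, which is why a small seasonal component there should not be read as real seasonality.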

MAP2192 · Lab 10

Lab 10: Time Series Forecasting with Statsmodels, Pandas, and PyCharm

2025-04-03

Two activities: (1) fit an ARIMA(1,1,1) on the synthetic daily series and forecast 30 days ahead; (2) fit a Holt-Winters exponential smoothing model (additive trend + additive seasonality, 7-day period) and forecast 30 days ahead.

Starter code (given in lab)

import numpy as np, pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.holtwinters import ExponentialSmoothing

np.random.seed(42)
date_rng = pd.date_range(start='2022-01-01', end='2022-04-01', freq='D')
ts_data = np.random.randn(len(date_rng)) + 10
ts_df = pd.DataFrame(ts_data, columns=['Value'], index=date_rng)

# Activity 1 — ARIMA(1,1,1)
arima = ARIMA(ts_df['Value'], order=(1,1,1)).fit()
fc = arima.get_forecast(steps=30)
idx = pd.date_range(start='2022-04-02', end='2022-05-01', freq='D')
plt.plot(ts_df['Value'], label='Original')
plt.plot(idx, fc.predicted_mean, color='red', label='ARIMA Forecast')
plt.legend(); plt.title('ARIMA(1,1,1) Forecast'); plt.show()

# Activity 2 — Holt-Winters
hw = ExponentialSmoothing(ts_df['Value'],
        trend='add', seasonal='add', seasonal_periods=7).fit()
hw_fc = hw.forecast(steps=30)
plt.plot(ts_df['Value'], label='Original')
plt.plot(hw_fc.index, hw_fc.values, color='red', label='Holt-Winters')
plt.legend(); plt.title('Holt-Winters Forecast'); plt.show()

Worked solution — Goal · Inputs · Outputs · Conclusion

Activity 1: ARIMA(1,1,1) forecasts a nearly flat line around 10.0, as expected for white-noise data; the confidence bands (available from fc.conf_int(), not drawn by the starter code) widen with the horizon. Activity 2: Holt-Winters fits a weekly seasonal pattern and a trend even though neither truly exists, producing a forecast that oscillates by about ±0.4 around 10. Neither model finds predictability beyond the series mean; the Holt-Winters forecast only looks more structured because it always fits a seasonal component.
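The claim that white noise has no predictability beyond its mean can be illustrated with two very simple one-step-ahead forecasts. The sketch below is an assumption-laden toy: it uses Python's random module with a fixed seed and a longer series than the lab's 91 days so that the comparison is stable, and it does not involve ARIMA or Holt-Winters at all.

```python
import random

random.seed(0)
# Hypothetical white-noise series around 10 (longer than the lab's series)
series = [10 + random.gauss(0, 1) for _ in range(2000)]

sq_err_mean = []   # forecast = mean of everything seen so far
sq_err_naive = []  # forecast = last observed value
running_sum = series[0]
for t in range(1, len(series)):
    mean_forecast = running_sum / t
    naive_forecast = series[t - 1]
    sq_err_mean.append((series[t] - mean_forecast) ** 2)
    sq_err_naive.append((series[t] - naive_forecast) ** 2)
    running_sum += series[t]

mse_mean = sum(sq_err_mean) / len(sq_err_mean)
mse_naive = sum(sq_err_naive) / len(sq_err_naive)
print(round(mse_mean, 3), round(mse_naive, 3))
```

For i.i.d. noise with variance σ², the running-mean forecast has mean squared error close to σ², while the last-value forecast has about 2σ², because it adds the previous point's noise to the current one. Any model that chases structure in such a series can at best match the mean forecast out of sample.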
