Generative Deep Learning, 2nd Edition
X_T should have zero mean and unit variance
The Rise and Development of Generative AI
Generative AI drew wide attention at the end of 2022 with the release of ChatGPT. ChatGPT's near-human conversational ability amazed many people and made the progress of AI technology obvious. Today ChatGPT is no longer limited to text conversation: it can interpret images, analyze PDF documents, and provide summaries and in-depth analysis. Similar tools on the market include Microsoft Copilot, Claude, and NotebookLM. Beyond text generation, AI has also extended into music composition and image generation, where mature platforms such as Stable Diffusion, Midjourney, and DALL-E are already available.
The Fundamental Difference Between Discriminative AI and Generative AI
Compared with the discriminative AI (Discriminative AI) that was mainstream around 2019 and focuses on tasks such as image classification or text classification, generative AI faces a more challenging problem. Discriminative AI mainly solves P(y=k|x), i.e., the probability that the label y equals k given the input x, without needing to understand the overall distribution of x.
Generative AI instead aims to estimate P(x): inferring the overall probability distribution from observed samples x1, x2, x3, .... This task is far more complex than discrimination. Understanding the distribution P(x) matters because, if we can find an approximate distribution Q(x) that is close to the true distribution P(x), then new samples x drawn from Q(x) will share the characteristics of the real data.
For example, in face generation, even though a generated face does not exist in the original training data set, it conforms to the distribution of real human faces, so it looks natural and realistic and is hard to distinguish from a real face. This is precisely where the power of generative AI lies.
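To make the idea concrete, here is a minimal one-dimensional sketch, assuming we approximate P(x) with a simple Gaussian Q(x) fitted to the observed samples (the data, parameter values, and variable names below are purely illustrative):
import numpy as np

# Observed samples x1, x2, x3, ... drawn from some unknown distribution P(x)
rng = np.random.default_rng(42)
observed = rng.normal(loc=170.0, scale=7.0, size=1000)

# Approximate P(x) with a Gaussian Q(x) whose parameters are estimated
# from the observed samples
mu_hat = observed.mean()
sigma_hat = observed.std()

# New samples drawn from Q(x) are not in the original data set, yet they
# share its statistical characteristics
new_samples = rng.normal(loc=mu_hat, scale=sigma_hat, size=5)
print(mu_hat, sigma_hat, new_samples)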
The formula for the skewness k3:
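A common definition, assuming k3 here denotes the skewness coefficient (the third standardized moment), is

k3 = E[(X − μ)^3] / σ^3

where μ is the mean and σ is the standard deviation of X; the sample version replaces the expectation with an average over the observed data.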
# -*- coding: UTF-8 -*-
"""
https://blog.ittraining.com.tw/2024/10/normal-distribution.html
"""
import numpy as np
import matplotlib.pyplot as plt

# matplotlib.mlab.bivariate_normal has been removed from recent matplotlib
# releases, so the density is re-implemented below.
def bivariate_normal(X, Y, sigmax=1.0, sigmay=1.0,
                     mux=0.0, muy=0.0, sigmaxy=0.0):
    """
    Bivariate Gaussian distribution for equal shape *X*, *Y*.
    See `bivariate normal
    <http://mathworld.wolfram.com/BivariateNormalDistribution.html>`_
    at mathworld.
    """
    Xmu = X - mux
    Ymu = Y - muy
    rho = sigmaxy / (sigmax * sigmay)
    z = Xmu**2 / sigmax**2 + Ymu**2 / sigmay**2 - 2 * rho * Xmu * Ymu / (sigmax * sigmay)
    denom = 2 * np.pi * sigmax * sigmay * np.sqrt(1 - rho**2)
    return np.exp(-z / (2 * (1 - rho**2))) / denom

def generateData(n, mean, cov):
    """
    Generate n samples from a multivariate normal distribution.
    """
    np.random.seed(2033)
    data = np.random.multivariate_normal(mean, cov, size=n)
    return data

def drawData(ax, mu, cov):
    data = generateData(150, mu, cov)
    ax.scatter(data[:, 0], data[:, 1])
    x = np.arange(-10, 10, 0.1)
    # meshgrid(x, x) produces a len(x) * len(x) grid of points and returns
    # their X and Y coordinates, so X and Y both have shape (len(x), len(x)).
    X, Y = np.meshgrid(x, x)
    # bivariate_normal expects standard deviations, so take the square root
    # of the diagonal variances; cov[0, 1] is passed as the covariance term.
    Z = bivariate_normal(X, Y, np.sqrt(cov[0, 0]), np.sqrt(cov[1, 1]),
                         mu[0], mu[1], cov[0, 1])
    ax.contour(X, Y, Z)
    ax.set_xlim([-10, 10])
    ax.set_ylim([-10, 10])
    ax.get_yaxis().set_visible(False)
    ax.get_xaxis().set_visible(False)

def visualize():
    """
    Visualize four samples with different covariance matrices.
    """
    fig = plt.figure(figsize=(10, 10), dpi=80)
    # Unit variances, no correlation
    ax = fig.add_subplot(2, 2, 1)
    cov = np.array([[1., 0.],
                    [0., 1.]])
    mu = np.array([0., 0.])
    drawData(ax, mu, cov)
    # Larger variances, no correlation
    ax = fig.add_subplot(2, 2, 2)
    cov = np.array([[4., 0.],
                    [0., 4.]])
    mu = np.array([0., 0.])
    drawData(ax, mu, cov)
    # Positive correlation
    ax = fig.add_subplot(2, 2, 3)
    cov = np.array([[4., 3.],
                    [3., 4.]])
    mu = np.array([0., 0.])
    drawData(ax, mu, cov)
    # Negative correlation
    ax = fig.add_subplot(2, 2, 4)
    cov = np.array([[4., -3.],
                    [-3., 4.]])
    mu = np.array([0., 0.])
    drawData(ax, mu, cov)
    plt.show()

if __name__ == "__main__":
    visualize()
- Trend line: use a moving average (MA) of the raw data to observe the long-term trend (the moving window size can be chosen freely).
- Seasonality: subtracting the trend line from the raw data emphasizes the periodic pattern.
- Residual: whatever remains after subtracting both the trend line and the seasonal component from the raw data is treated as noise (random fluctuation) or short-term variation. (A hand-rolled sketch of these three steps follows below; the statsmodels-based version comes after it.)
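A minimal sketch of this decomposition done by hand, assuming a pandas Series named close holding daily closing prices and an illustrative period of 30 observations (the series name and window length are assumptions, not taken from the code below):
import numpy as np
import pandas as pd

def manual_decompose(close: pd.Series, period: int = 30):
    """Additive decomposition: close = trend + seasonal + residual."""
    # Trend line: centered moving average over one full period
    trend = close.rolling(window=period, center=True).mean()
    # Removing the trend emphasizes the periodic pattern
    detrended = close - trend
    # Seasonal component: average the detrended values at each position
    # within the period, then repeat that profile over the whole series
    position = np.arange(len(close)) % period
    seasonal_profile = detrended.groupby(position).mean()
    seasonal = pd.Series(seasonal_profile.values[position], index=close.index)
    # Residual: whatever is left after removing trend and seasonality
    residual = close - trend - seasonal
    return trend, seasonal, residual

The statsmodels-based code below performs the same additive decomposition with sm.tsa.seasonal_decompose.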
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

def process_stock_data(df):
    # Copy the DataFrame to avoid modifying the original
    processed_df = df.copy()
    # Define columns to drop
    cols_to_drop = ['Open', 'High', 'Low', 'Volume', 'Dividends', 'Stock Splits']
    # Drop specified columns
    processed_df.drop(cols_to_drop, axis=1, inplace=True)
    # Convert 'Close' column to numeric
    processed_df['Close'] = pd.to_numeric(processed_df['Close'], errors='coerce')
    # Drop rows with missing 'Close' values, if any
    processed_df = processed_df.dropna(subset=['Close'])
    return processed_df

# `df` is assumed to be a daily stock-price DataFrame with 'Date' and 'Close'
# columns (for example, downloaded with yfinance).
filtered_dataframe = process_stock_data(df)
filtered_dataframe = filtered_dataframe.groupby('Date')['Close'].sum().reset_index()
filtered_dataframe.head()

# Data decomposition
from pylab import rcParams
rcParams['figure.figsize'] = 18, 8

# Convert 'Date' column to datetime if not already in datetime format
filtered_dataframe['Date'] = pd.to_datetime(filtered_dataframe['Date'])
# Set 'Date' column as the index
filtered_dataframe.set_index('Date', inplace=True)

# Now perform the seasonal decomposition: additive model with a period of
# 30 observations (roughly one month of daily data)
decomposition = sm.tsa.seasonal_decompose(filtered_dataframe['Close'], model='additive', period=30)
trend = decomposition.trend
seasonal = decomposition.seasonal
residual = decomposition.resid

# Plot the original series and the decomposed components
plt.figure(figsize=(18, 8))
plt.subplot(411)
plt.plot(filtered_dataframe.index, filtered_dataframe['Close'], label='Original')
plt.legend(loc='best')
plt.subplot(412)
plt.plot(filtered_dataframe.index, trend, label='Trend')
plt.legend(loc='best')
plt.subplot(413)
plt.plot(filtered_dataframe.index, seasonal, label='Seasonal')
plt.legend(loc='best')
plt.subplot(414)
plt.plot(filtered_dataframe.index, residual, label='Residual')
plt.legend(loc='best')
plt.tight_layout()
plt.show()
ACF plot (x-axis: lag k, y-axis: ACF value)
The ACF plot below shows a series with no seasonality.
ACF plot (x-axis: lag k, y-axis: ACF value)
The ACF plot below shows a series with seasonality: peaks of similar height appear at lags n*k, indicating that the period is k.
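As a reference, a minimal sketch of producing such an ACF plot with statsmodels, using an illustrative synthetic series with period 30 (the data and the number of lags are assumptions chosen for demonstration):
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf

# Illustrative series: a sine wave with period 30 plus noise, so the ACF
# should show similar peaks near lags 30, 60, 90, ...
rng = np.random.default_rng(0)
t = np.arange(365)
series = pd.Series(np.sin(2 * np.pi * t / 30) + 0.3 * rng.standard_normal(len(t)))

# x-axis: lag k, y-axis: autocorrelation at lag k
plot_acf(series, lags=90)
plt.show()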
A botnet is a network of compromised computers or devices, often referred to as "bots" or "zombies," that are controlled remotely by a malicious actor (known as a "botmaster"). These devices are typically infected with malware, allowing the botmaster to execute various commands on them without the device owner’s knowledge.
Here are some common uses and dangers of botnets:
- Distributed Denial of Service (DDoS) Attacks: Botnets are often used to flood a target server or website with traffic, overwhelming it and causing it to crash or become unavailable to users.
- Spam Distribution: They can be used to send out massive amounts of spam emails or phishing messages, which can lead to further infections or fraud.
- Credential Stuffing: Botnets may attempt to use stolen usernames and passwords on different sites, automating the process to try many combinations quickly.