注意
前往結尾下載完整的範例程式碼。或透過 Binder 在您的瀏覽器中執行此範例
Fisher 向量特徵編碼#
Fisher 向量是一種影像特徵編碼和量化技術,可以被視為流行的詞袋 (bag-of-visual-words) 或 VLAD 演算法的軟性或機率版本。影像使用視覺詞彙進行建模,該視覺詞彙是使用在低層影像特徵 (例如 SIFT 或 ORB 描述子) 上訓練的 K 模式高斯混合模型估計的。Fisher 向量本身是高斯混合模型 (GMM) 相對於其參數 (混合權重、均值和共變異數矩陣) 的梯度串聯。
在此範例中,我們計算 scikit-learn 中數字資料集的 Fisher 向量,並在這些表示上訓練分類器。
請注意,執行此範例需要 scikit-learn。

precision recall f1-score support
0 0.89 0.92 0.90 51
1 0.67 0.82 0.73 44
2 0.61 0.55 0.58 40
3 0.63 0.51 0.56 53
4 0.75 0.60 0.67 45
5 0.52 0.70 0.60 40
6 0.50 0.48 0.49 46
7 0.48 0.64 0.55 39
8 0.55 0.50 0.53 42
9 0.62 0.50 0.56 50
accuracy 0.62 450
macro avg 0.62 0.62 0.62 450
weighted avg 0.63 0.62 0.62 450
from matplotlib import pyplot as plt
import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics import classification_report, ConfusionMatrixDisplay
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from skimage.transform import resize
from skimage.feature import fisher_vector, ORB, learn_gmm
data = load_digits()
images = data.images
targets = data.target
# Resize images so that ORB detects interest points for all images
images = np.array([resize(image, (80, 80)) for image in images])
# Compute ORB descriptors for each image
descriptors = []
for image in images:
detector_extractor = ORB(n_keypoints=5, harris_k=0.01)
detector_extractor.detect_and_extract(image)
descriptors.append(detector_extractor.descriptors.astype('float32'))
# Split the data into training and testing subsets
train_descriptors, test_descriptors, train_targets, test_targets = train_test_split(
descriptors, targets
)
# Train a K-mode GMM
k = 16
gmm = learn_gmm(train_descriptors, n_modes=k)
# Compute the Fisher vectors
training_fvs = np.array(
[fisher_vector(descriptor_mat, gmm) for descriptor_mat in train_descriptors]
)
testing_fvs = np.array(
[fisher_vector(descriptor_mat, gmm) for descriptor_mat in test_descriptors]
)
svm = LinearSVC().fit(training_fvs, train_targets)
predictions = svm.predict(testing_fvs)
print(classification_report(test_targets, predictions))
ConfusionMatrixDisplay.from_estimator(
svm,
testing_fvs,
test_targets,
cmap=plt.cm.Blues,
)
plt.show()
腳本總執行時間: (0 分鐘 33.406 秒)