Router Magazine
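Similar-Image Detection with the Google Vision API
This is kanazawa, an engineer. For my first machine-learning topic in a while, I tried similar-image detection. There are classical image-analysis approaches to this problem, but this time I took the quick-and-dirty route: send the images to the Google Vision API and see whether the similarity of the results can be used to detect similar images.
How the image features are built
I use the LABEL_DETECTION feature of the Google Vision API to attach labels to each image.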
- Sample image
- Google Vision API result
{
  "responses": [
    {
      "labelAnnotations": [
        { "mid": "/m/0c9ph5", "score": 0.97519124, "topicality": 0.97519124, "description": "Flower" },
        { "mid": "/m/05s2s", "score": 0.9259831, "topicality": 0.9259831, "description": "Plant" },
        { "mid": "/m/07j7r", "score": 0.88428324, "topicality": 0.88428324, "description": "Tree" },
        { "mid": "/m/016nqt", "score": 0.88323385, "topicality": 0.88323385, "description": "Twig" },
        { "mid": "/m/0b5gs", "score": 0.8688874, "topicality": 0.8688874, "description": "Branch" },
        { "mid": "/m/016q19", "score": 0.84059125, "topicality": 0.84059125, "description": "Petal" },
        { "mid": "/m/01fklc", "score": 0.82767916, "topicality": 0.82767916, "description": "Pink" },
        { "mid": "/m/02tcwp", "score": 0.7772147, "topicality": 0.7772147, "description": "Trunk" },
        { "mid": "/m/02q_bfg", "score": 0.760938, "topicality": 0.760938, "description": "Tints and shades" },
        { "mid": "/m/04sjm", "score": 0.74587005, "topicality": 0.74587005, "description": "Flowering plant" },
        { "mid": "/m/0j7ty", "score": 0.7320647, "topicality": 0.7320647, "description": "Blossom" },
        { "mid": "/m/083vt", "score": 0.7303314, "topicality": 0.7303314, "description": "Wood" },
        { "mid": "/m/018ssc", "score": 0.718404, "topicality": 0.718404, "description": "Groundcover" },
        { "mid": "/m/0hlzt", "score": 0.70114774, "topicality": 0.70114774, "description": "Deciduous" },
        { "mid": "/m/08t9c_", "score": 0.6235262, "topicality": 0.6235262, "description": "Grass" },
        { "mid": "/m/01ttd6", "score": 0.5967256, "topicality": 0.5967256, "description": "Plant stem" },
        { "mid": "/m/0gqbt", "score": 0.5924945, "topicality": 0.5924945, "description": "Shrub" },
        { "mid": "/m/03d28y3", "score": 0.5875379, "topicality": 0.5875379, "description": "Natural landscape" },
        { "mid": "/m/01qd72", "score": 0.57774043, "topicality": 0.57774043, "description": "Malus" },
        { "mid": "/m/0ckc5", "score": 0.55246717, "topicality": 0.55246717, "description": "Magenta" },
        { "mid": "/m/0hs32", "score": 0.54021865, "topicality": 0.54021865, "description": "Prunus" },
        { "mid": "/m/01wh_8", "score": 0.5165784, "topicality": 0.5165784, "description": "Cherry blossom" }
      ]
    }
  ]
}
- Labels: this time I ignore the topicality values and simply join the labels with single spaces.
Flower Plant Tree Twig Branch Petal Pink Trunk Tints and shades Flowering plant Blossom Wood Groundcover Deciduous Grass Plant stem Shrub Natural landscape Malus Magenta Prunus Cherry blossom
The fact that this is a flowering tree is encoded nicely!
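The label-joining step described above can be sketched in a few lines of Ruby. This is only an illustration: `labels_to_feature` is a hypothetical helper name, and the inline hash is a two-label excerpt of the response shown above rather than a full API result.

```ruby
# Hypothetical helper (not part of the original scripts): turns a parsed
# Vision API LABEL_DETECTION response into one space-separated label string.
def labels_to_feature(api_result)
  annotations = api_result.dig('responses', 0, 'labelAnnotations') || []
  # score and topicality are deliberately ignored, as in the article
  annotations.map { |a| a['description'] }.join(' ')
end

# Two-label excerpt of the response shown above
sample = {
  'responses' => [
    { 'labelAnnotations' => [
      { 'mid' => '/m/0c9ph5', 'score' => 0.97519124, 'description' => 'Flower' },
      { 'mid' => '/m/05s2s',  'score' => 0.9259831,  'description' => 'Plant' }
    ] }
  ]
}

puts labels_to_feature(sample) # prints: Flower Plant
```

Multi-word labels such as "Tints and shades" are split into separate words once the string is joined, which is what the full-text search later operates on.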
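Sample images
Commercial use is permitted, so I took the images from PAKUTASO. I spread them across the following 10 categories with 3 images each, for a total of 30 test images.
- Cherry blossoms
- Men
- Mountains
- Rivers
- Factories
- Full moon
- Dogs
- Birds
- Cars
- Money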
DB table schema
PostgreSQL comes with decent full-text search out of the box, so I created the tables in PostgreSQL.
Table for managing the Google Vision API requests
CREATE TABLE public.image_label_annotation_results (
    id bigserial PRIMARY KEY,
    status smallint DEFAULT 0, -- 0 = pending, 1 = in progress, 2 = done
    server character varying(50) DEFAULT NULL::character varying,
    image_hash text NOT NULL,
    image_s3_url text NOT NULL,
    api_result jsonb,
    created_at timestamp(0) without time zone DEFAULT CURRENT_TIMESTAMP(0) NOT NULL,
    updated_at timestamp(0) without time zone DEFAULT CURRENT_TIMESTAMP(0) NOT NULL
);
Table holding the labels parsed from the API response (JSON)
CREATE TABLE public.image_annotation_labels (
    id bigserial PRIMARY KEY,
    image_hash text NOT NULL,
    label character varying(1000) DEFAULT NULL::character varying,
    label_search tsvector, -- filled with to_tsvector('english', label)
    created_at timestamp(0) without time zone DEFAULT CURRENT_TIMESTAMP(0) NOT NULL,
    updated_at timestamp(0) without time zone DEFAULT CURRENT_TIMESTAMP(0) NOT NULL
);
Scripts
First, register the images you want to send to the API with the following SQL query.
INSERT INTO image_label_annotation_results (image_hash, image_s3_url) VALUES
('test_001', 'https://example.com/test_001.jpg'),
('test_002', 'https://example.com/test_002.jpg'),
('test_003', 'https://example.com/test_003.jpg'),
('test_004', 'https://example.com/test_004.jpg');
Note: image_hash is the key that uniquely identifies an image.
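The example rows use hand-written keys such as 'test_001'. If you would rather derive the key from the image itself, one possible approach is to hash the file contents. The following is a minimal sketch under that assumption: `image_hash_for` is a hypothetical helper, and SHA-256 is simply one reasonable choice, not something the original setup prescribes.

```ruby
require 'digest'

# Hypothetical helper: derives image_hash from the raw image bytes, so the
# same file always maps to the same key.
def image_hash_for(image_bytes)
  Digest::SHA256.hexdigest(image_bytes)
end

# Stand-in bytes; in practice these would come from something like
# File.binread on the image file.
bytes = 'dummy image bytes'.b
puts image_hash_for(bytes)
```

A content-based key also makes it easy to notice when the same image is registered twice.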
Next, call the Google Vision API with the following script.
require 'net/http'
require 'json'
require 'socket'
require 'active_record'
require 'pg'

ActiveRecord::Base.establish_connection(
  adapter: 'postgresql',
  host: '127.0.0.1',
  username: 'postgres',
  password: 'postgres!',
  database: 'image_annotation',
  schema_search_path: 'public',
  reconnect: true
)
ActiveRecord.default_timezone = :local
Time.zone_default = Time.find_zone! 'Tokyo'

# Name of the worker that handled the request (stored in the `server` column).
# Assumption: the original listing does not show how SERVER is defined,
# so the local hostname is used here.
SERVER = Socket.gethostname

class ImageLabelAnnotationResult < ActiveRecord::Base
  has_one :image_annotation_label, primary_key: :image_hash, foreign_key: :image_hash
end

def request_label_api(image_url:)
  # Get an access token from the gcloud CLI (strip the trailing newline)
  access_token = `gcloud auth application-default print-access-token`.strip
  uri = URI('https://vision.googleapis.com/v1/images:annotate')
  body = {
    requests: [
      {
        features: [{ maxResults: 50, type: 'LABEL_DETECTION' }],
        image: { source: { imageUri: image_url } }
      }
    ]
  }.to_json
  headers = {
    'Authorization' => "Bearer #{access_token}",
    'Content-Type' => 'application/json'
  }
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = true
  request = Net::HTTP::Post.new(uri.request_uri, headers)
  request.body = body
  response = http.request(request)
  JSON.parse(response.body)
end

def create_vision_api_results(request_record)
  request_record.update(status: 1) # 1 = in progress
  json = request_label_api(image_url: request_record.image_s3_url)
  request_record.update(status: 2, server: SERVER, api_result: json) # 2 = done
end

if __FILE__ == $0
  loop do
    # Pick one pending (status 0) record at random; stop when none are left
    request_id = ImageLabelAnnotationResult.where(status: 0).limit(50).pluck(:id).sample
    break unless request_id

    request_record = ImageLabelAnnotationResult.find(request_id)
    create_vision_api_results(request_record)
  end
end
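Note that the script above marks a row as done (status 2) without checking whether the API call actually succeeded. Google APIs generally report a request-level failure as a top-level `error` object, and the Vision API can also report a per-image failure as an `error` object inside an element of `responses`. A minimal sketch of a check, assuming those response shapes; `api_error_message` is a hypothetical helper and the sample error payload is made up for illustration.

```ruby
# Hypothetical helper: returns an error message if the parsed response
# reports a failure, or nil if it looks usable.
def api_error_message(json)
  # Request-level failure (for example bad credentials)
  return json.dig('error', 'message') || 'unknown error' if json['error']

  first = json.dig('responses', 0)
  return 'empty response' unless first
  # Per-image failure (for example an image URL that cannot be fetched)
  return first.dig('error', 'message') || 'unknown error' if first['error']

  nil
end

ok  = { 'responses' => [{ 'labelAnnotations' => [{ 'description' => 'Flower' }] }] }
bad = { 'responses' => [{ 'error' => { 'code' => 14, 'message' => 'made-up example message' } }] }

p api_error_message(ok)  # nil
p api_error_message(bad) # "made-up example message"
```

With a check like this, a failed row could be left at status 1 or moved to a separate error status instead of being treated as complete.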
Finally, the following script parses the API result JSON and fills the table used for full-text search.
require 'json'
require 'active_record'
require 'pg'

ActiveRecord::Base.establish_connection(
  adapter: 'postgresql',
  host: '127.0.0.1',
  username: 'postgres',
  password: 'postgres!',
  database: 'image_annotation',
  schema_search_path: 'public',
  reconnect: true
)
ActiveRecord.default_timezone = :local
Time.zone_default = Time.find_zone! 'Tokyo'

class ImageLabelAnnotationResult < ActiveRecord::Base
  has_one :image_annotation_label, primary_key: :image_hash, foreign_key: :image_hash
end

class ImageAnnotationLabel < ActiveRecord::Base
  has_one :image_label_annotation_result, primary_key: :image_hash, foreign_key: :image_hash
end

def parse_vision_api_result(result_record)
  annotation = result_record.api_result.dig('responses', 0, 'labelAnnotations')
  return unless annotation

  # Join all label descriptions with single spaces
  label_str = annotation.map { |label_hash| label_hash['description'] }.join(' ')
  rec = ImageAnnotationLabel.create(image_hash: result_record.image_hash, label: label_str)
  # Build the tsvector used for full-text search from the label string
  ImageAnnotationLabel.where(id: rec.id).update_all("label_search = to_tsvector('english', label)")
end

if __FILE__ == $0
  # Process every record whose API call has finished (status 2)
  ImageLabelAnnotationResult.where(status: 2).order(:id).pluck(:id).each do |id|
    result_record = ImageLabelAnnotationResult.find(id)
    parse_vision_api_result(result_record)
  end
end
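Experiment
The following queries compare the labels against each other, and the top 3 results are treated as the similar images. To run the full-text search, the words in the source image's labels are first joined with OR (|).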
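The article does not show how the OR-joined query string is produced from a label string. The following is a minimal sketch of one way to do it in Ruby. `build_or_tsquery` is a hypothetical helper, and both the filtering of non-word tokens and the de-duplication are assumptions added here; the queries in the examples keep duplicate words, so the rank values may not be identical.

```ruby
# Hypothetical helper: turns a space-separated label string into an
# OR-joined to_tsquery argument such as 'Flower|Plant|Tree'.
def build_or_tsquery(label_str)
  label_str.split
           .grep(/\A[\p{Alnum}-]+\z/) # drop tokens like "&" that to_tsquery would read as operators
           .uniq { |t| t.downcase }   # "Plant" and "plant" only need to appear once
           .join('|')
end

puts build_or_tsquery('Flower Plant Flowering plant Automotive tail & brake light')
# prints: Flower|Plant|Flowering|Automotive|tail|brake|light
```

Dropping the "&" token matters for labels such as "Automotive tail & brake light", which appears in one of the result label sets. When running the query from application code, passing the resulting string as a bind parameter rather than interpolating it into the SQL is the safer option.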
Example 1 <Cherry blossoms>
- Source image
- Labels of the source image
Flower Plant Tree Twig Branch Petal Pink Trunk Tints and shades Flowering plant Blossom Wood Groundcover Deciduous Grass Plant stem Shrub Natural landscape Malus Magenta Prunus Cherry blossom
- Query
select ts_rank_cd(label_search, to_tsquery('Flower|Plant|Tree|Twig|Branch|Petal|Pink|Trunk|Tints|and|shades|Flowering|plant|Blossom|Wood|Groundcover|Deciduous|Grass|Plant|stem|Shrub|Natural|landscape|Malus|Magenta|Prunus|Cherry|blossom')) as rank, ilar.image_s3_url
from image_annotation_labels ial
join image_label_annotation_results ilar on ial.image_hash = ilar.image_hash
order by rank desc
limit 3;
- Results
  - Similar image #1
Flower Plant Tree Twig Branch Petal Pink Trunk Tints and shades Flowering plant Blossom Wood Groundcover Deciduous Grass Plant stem Shrub Natural landscape Malus Magenta Prunus Cherry blossom
  - Similar image #2
Plant Flower Botany Tree Branch Twig Natural landscape Biome Woody plant Grass Groundcover Tints and shades Shrub Blossom Flowering plant Landscape Garden Cherry blossom Trunk Landscaping Plant stem Road surface Petal City Prunus Road House Sidewalk
  - Similar image #3
Water Plant Water resources Fluvial landforms of streams Natural landscape Natural environment Branch Spring Waterfall Vegetation Watercourse Riparian zone Terrestrial plant Chute Landscape Groundcover Formation Grass Mountain river Tree Forest Non-vascular land plant Water feature Creek Stream Arroyo Temperate broadleaf and mixed forest Rapid Woodland Jungle Wildlife Mineral spring Vascular plant Rock Riparian forest Old-growth forest Tropical and subtropical coniferous forests Ravine Valdivian temperate rain forest Northern hardwood forest Stream bed Rainforest Shrub Moss Garden Algae Tributary Antarctic flora River Tropics
Example 2 <Cars>
- Source image
- Labels of the source image
Wheel Car Tire Land vehicle Vehicle Vehicle registration plate Building Automotive lighting Motor vehicle Window Infrastructure Automotive design Mode of transport Automotive exterior Plant Classic car Road City Automotive wheel system Classic Art Street Antique car Family car Parking Luxury vehicle Rolling Mid-size car Town Pedestrian Compact car Vintage car Grille Asphalt Sidewalk Personal luxury car Hardtop Coupé City car Van Sedan Facade Transport Cobblestone Bumper House Sky Traffic
- Query
select ts_rank_cd(label_search, to_tsquery('Wheel|Car|Tire|Land|vehicle|Vehicle|Vehicle|registration|plate|Building|Automotive|lighting|Motor|vehicle|Window|Infrastructure|Automotive|design|Mode|of|transport|Automotive|exterior|Plant|Classic|car|Road|City|Automotive|wheel|system|Classic|Art|Street|Antique|car|Family|car|Parking|Luxury|vehicle|Rolling|Mid-size|car|Town|Pedestrian|Compact|car|Vintage|car|Grille|Asphalt|Sidewalk|Personal|luxury|car|Hardtop|Coupé|City|car|Van|Sedan|Facade|Transport|Cobblestone|Bumper|House|Sky|Traffic')) as rank, ial."label", ilar.image_s3_url
from image_annotation_labels ial
join image_label_annotation_results ilar on ial.image_hash = ilar.image_hash
order by rank desc
limit 3;
- Results
  - Similar image #1
Wheel Car Tire Land vehicle Vehicle Vehicle registration plate Building Automotive lighting Motor vehicle Window Infrastructure Automotive design Mode of transport Automotive exterior Plant Classic car Road City Automotive wheel system Classic Art Street Antique car Family car Parking Luxury vehicle Rolling Mid-size car Town Pedestrian Compact car Vintage car Grille Asphalt Sidewalk Personal luxury car Hardtop Coupé City car Van Sedan Facade Transport Cobblestone Bumper House Sky Traffic
  - Similar image #2
Car Tire Wheel Land vehicle Vehicle Vehicle registration plate Automotive lighting Hood Automotive design Motor vehicle Automotive tire Exhaust system Chevrolet corvette c6 zr1 Automotive exhaust Alloy wheel Automotive exterior Rolling Fender Personal luxury car Bumper Automotive wheel system Hardtop Spoiler Rim Asphalt Road Grand tourer Kit car Performance car Building Lotus Muffler Ferrari f430 Sky Sports car Automotive parking light Luxury vehicle Auto part Muscle car Vehicle door Automotive tail & brake light Electric blue Race car Coupé Chevrolet corvette Tree Automotive mirror Supercar Tesla Chevrolet
  - Similar image #3
Bus Vehicle Tire Wheel Window Motor vehicle Building Tree Mode of transport Vehicle registration plate Automotive lighting Road surface Thoroughfare Gas Electricity Metropolitan area City Public transport Road Metropolis Winter Lane Electric blue Pedestrian Street Commercial vehicle Automotive exterior Tour bus service Sidewalk Downtown Public utility Car Automotive wheel system Mixed-use Asphalt Traffic Cobblestone Transport Electrical supply
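Summary
- I used the label detection feature of the Google Vision API to convert images into labels, then ran full-text search over those labels to implement similar-image search.
- When an image contains a physical object such as a car, the label conversion is accurate, so this approach looks capable of pulling up similar images reasonably well.
- When there is no clear object, label detection returns keywords that carry little meaning, which makes things difficult.
- There is plenty of room to improve the full-text search logic.
Incidentally, the API costs $1.50 per 1,000 images.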
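As one possible direction for the search logic mentioned in the summary, the label sets could also be compared directly instead of going through full-text ranking. The sketch below uses Jaccard similarity, which is a different technique from the `ts_rank_cd` ranking used in the experiment and has not been evaluated on the test images. `jaccard_similarity` is a hypothetical helper, and the label strings are shortened, illustrative examples.

```ruby
require 'set'

# Hypothetical helper: Jaccard similarity between two label strings,
# computed on lower-cased word sets (size of intersection / size of union).
def jaccard_similarity(labels_a, labels_b)
  a = labels_a.downcase.split.to_set
  b = labels_b.downcase.split.to_set
  union = a | b
  return 0.0 if union.empty?

  (a & b).size.to_f / union.size
end

cherry = 'Flower Plant Tree Petal Pink'
garden = 'Plant Flower Tree Garden'
car    = 'Wheel Car Tire Vehicle'

puts jaccard_similarity(cherry, garden) # prints: 0.5
puts jaccard_similarity(cherry, car)    # prints: 0.0
```

Unlike the OR query, this score penalizes images that carry many unrelated labels, which might help with cases like the river image ranking third for the cherry blossom query.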