An Example of Face Detection and Grouping in .NET

Updated: 2019-11-27 14:33:09  Author: .NET騷操作

This article walks through an example of detecting and grouping faces with .NET, with detailed sample code that may serve as a useful reference for study or work.

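At amusement parks, glass skywalks, ski resorts and similar venues, you often see photographers at work. One headache for the operators is the sheer number of photos: finding yourself among thousands of pictures is no easy task for a customer. The same goes for a day trip or a family gathering, where too many photos make choosing painful.

Fortunately there is .NET: with only a small amount of code, you can find the faces and sort the photos by person.

This article uses the Cognitive Services API provided by Microsoft Azure to detect faces and group them. It can be used for free; sign up at https://portal.azure.com. After signing up you receive two keys, and either of them is enough to run all the code in this article. A key looks like this (not a real key):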

fa3a7bfd807ccd6b17cf559ad584cbaa

Usage

First install the NuGet package Microsoft.Azure.CognitiveServices.Vision.Face (the latest version at the time of writing is 2.5.0-preview.1), then create a FaceClient:

string key = "fa3a7bfd807ccd6b17cf559ad584cbaa"; // replace with your own key
using var fc = new FaceClient(new ApiKeyServiceClientCredentials(key))
{
  Endpoint = "https://southeastasia.api.cognitive.microsoft.com",
};

Then run detection on a photo:

using var file = File.OpenRead(@"C:\Photos\DSC_996ICU.JPG");
IList<DetectedFace> faces = await fc.Face.DetectWithStreamAsync(file);

The returned faces is an IList, so a single call can obviously detect several faces. A sample result (converted to JSON) looks like this:

[
  {
   "FaceId": "9997b64e-6e62-4424-88b5-f4780d3767c6",
   "RecognitionModel": null,
   "FaceRectangle": {
    "Width": 174,
    "Height": 174,
    "Left": 62,
    "Top": 559
   },
   "FaceLandmarks": null,
   "FaceAttributes": null
  },
  {
   "FaceId": "8793b251-8cc8-45c5-ab68-e7c9064c4cfd",
   "RecognitionModel": null,
   "FaceRectangle": {
    "Width": 152,
    "Height": 152,
    "Left": 775,
    "Top": 580
   },
   "FaceLandmarks": null,
   "FaceAttributes": null
  }
 ]

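As you can see, this photo returned two DetectedFace objects. Each stores its ID in FaceId, which is used in the later grouping step, and the position of the face in FaceRectangle, which you can use for further processing. RecognitionModel, FaceLandmarks and FaceAttributes are optional extras, covering things such as gender, age and emotion. They are not returned by default but can be switched on through optional parameters of the detection API. They are fun to play with if you are interested.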

Finally, call .GroupAsync to group the faceIds detected earlier:

var faceIds = faces.Select(x => x.FaceId.Value).ToList();
GroupResult result = await fc.Face.GroupAsync(faceIds);

This returns a GroupResult, which is defined as follows:

public class GroupResult
{
  public IList<IList<Guid>> Groups
  {
    get;
    set;
  }

  public IList<Guid> MessyGroup
  {
    get;
    set;
  }

  // ...
}

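It contains a Groups property and a MessyGroup property. Groups is a list of lists that holds the face groups, while MessyGroup holds the FaceIds that could not be placed in any group.

With this, a short piece of code can copy the photos of each face group into its own folder (.Dump() is LINQPad's output helper and can be dropped when running outside LINQPad):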

void CopyGroup(string outputPath, GroupResult result, Dictionary<Guid, (string file, DetectedFace face)> faces)
{
  foreach (var item in result.Groups
    .SelectMany((group, index) => group.Select(v => (faceId: v, index)))
    .Select(x => (info: faces[x.faceId], i: x.index + 1)).Dump())
  {
    string dir = Path.Combine(outputPath, item.i.ToString());
    Directory.CreateDirectory(dir);
    File.Copy(item.info.file, Path.Combine(dir, Path.GetFileName(item.info.file)), overwrite: true);
  }
  
  string messyFolder = Path.Combine(outputPath, "messy");
  Directory.CreateDirectory(messyFolder);
  foreach (var file in result.MessyGroup.Select(x => faces[x].file).Distinct())
  {
    File.Copy(file, Path.Combine(messyFolder, Path.GetFileName(file)), overwrite: true);
  }
}
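The LINQ chain inside CopyGroup can be hard to read at first glance. The following is a minimal sketch of just the flattening step, using plain Guid lists in place of a real GroupResult. The class name GroupFlattening and the method Flatten are made up for this illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class GroupFlattening
{
  // Turns [[a, b], [c]] into (a, 1), (b, 1), (c, 2): every face ID is paired
  // with the 1-based number of the group (and output folder) it belongs to.
  public static List<(Guid faceId, int groupNo)> Flatten(IList<IList<Guid>> groups) =>
    groups
      .SelectMany((group, index) => group.Select(faceId => (faceId, groupNo: index + 1)))
      .ToList();
}
```

The overload of SelectMany that also passes the element index is what lets each face ID remember which group it came from after the nested lists are flattened.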

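Running it on 102 photos produced 15 groups plus one group of faces that "found no teammates".

What Could Still Go Wrong?

Just two API calls and you are done; too easy? Not quite. There are still plenty of problems.

Images Are Too Large and Need Compression

After all, the images have to be uploaded to a cloud service. Today's phones, DSLRs and mirrorless cameras easily reach tens of megapixels, and a single JPG can easily exceed 10 MB. Uploading without compression is, first of all, more than the bandwidth and upload speed can take.

Second, Azure does not accept such files anyway. The documentation (https://docs.microsoft.com/en-us/rest/api/cognitiveservices/face/face/detectwithstream) states that images may be at most 6 MB, and that the 36x36 minimum face size applies to images up to 1920x1080; larger images need proportionally larger faces, so there is little to gain from uploading more pixels: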

JPEG, PNG, GIF (the first frame), and BMP format are supported. The allowed image file size is from 1KB to 6MB.
The minimum detectable face size is 36x36 pixels in an image no larger than 1920x1080 pixels. Images with dimensions higher than 1920x1080 pixels will need a proportionally larger minimum face size.

So an image that is too large must be shrunk first (and an image that is already small obviously does not need it). With .NET's Bitmap and a C# 8.0 switch expression, the check and the resize fit into one short method:

byte[] CompressImage(string image, int edgeLimit = 1920)
{
  using var bmp = Bitmap.FromFile(image);
  
  using var resized = (1.0 * Math.Max(bmp.Width, bmp.Height) / edgeLimit) switch
  {
    var x when x > 1 => new Bitmap(bmp, new Size((int)(bmp.Size.Width / x), (int)(bmp.Size.Height / x))), 
    _ => bmp, 
  };
  
  using var ms = new MemoryStream();
  resized.Save(ms, ImageFormat.Jpeg);
  return ms.ToArray();
}
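To make the resize arithmetic easier to check by hand, here is a minimal sketch of the same size calculation without any System.Drawing dependency. The names ImageSizing and FitWithin are hypothetical; only the formula mirrors CompressImage above.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class ImageSizing
{
  // Returns the size to upload: unchanged when the longer edge already fits,
  // otherwise shrunk so the longer edge equals edgeLimit (aspect ratio kept).
  public static (int width, int height) FitWithin(int width, int height, int edgeLimit = 1920)
  {
    double ratio = 1.0 * Math.Max(width, height) / edgeLimit;
    return ratio > 1
      ? ((int)(width / ratio), (int)(height / ratio))
      : (width, height);
  }
}
```

For example, a 6000x4000 photo has a ratio of 3.125 and comes out as 1920x1280, while a 1000x800 photo is left alone.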

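Portrait-Orientation Photos

Cameras generally have a 3:2 sensor, and most photos are shot in landscape. Occasionally, for the sake of composition, we shoot in portrait. Many APIs now tolerate faces tilted by about plus or minus 30 degrees, but faces rotated a full 90 degrees are basically not supported, as in the photo below (I really could not find a model whose photo I was licensed to use 😂):

Fortunately, photos keep their EXIF metadata after shooting, so we only need to read the EXIF orientation and rotate the photo accordingly: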

void HandleOrientation(Image image, PropertyItem[] propertyItems)
{
  const int exifOrientationId = 0x112;
  PropertyItem orientationProp = propertyItems.FirstOrDefault(i => i.Id == exifOrientationId);
  
  if (orientationProp == null) return;
  
  int val = BitConverter.ToUInt16(orientationProp.Value, 0);
  RotateFlipType rotateFlipType = val switch
  {
    2 => RotateFlipType.RotateNoneFlipX, 
    3 => RotateFlipType.Rotate180FlipNone, 
    4 => RotateFlipType.Rotate180FlipX, 
    5 => RotateFlipType.Rotate90FlipX, 
    6 => RotateFlipType.Rotate90FlipNone, 
    7 => RotateFlipType.Rotate270FlipX, 
    8 => RotateFlipType.Rotate270FlipNone, 
    _ => RotateFlipType.RotateNoneFlipNone, 
  };
  
  if (rotateFlipType != RotateFlipType.RotateNoneFlipNone)
  {
    image.RotateFlip(rotateFlipType);
  }
}

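After rotation, my photo looks like this:

Now photos shot in portrait are detected as well.

Parallel Speed-Up

As mentioned earlier, a folder may hold thousands of files, and uploading and detecting them one at a time can be slow. That code might look like this: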

Dictionary<Guid, (string file, DetectedFace face)> faces = GetFiles(inFolder)
 .Select(file => 
 {
  byte[] bytes = CompressImage(file);
  var result = (file, faces: fc.Face.DetectWithStreamAsync(new MemoryStream(bytes)).GetAwaiter().GetResult());
  (result.faces.Count == 0 ? $"{file} not detect any face!!!" : $"{file} detected {result.faces.Count}.").Dump();
  return (file, faces: result.faces.ToList());
 })
 .SelectMany(x => x.faces.Select(face => (x.file, face)))
 .ToDictionary(x => x.face.FaceId.Value, x => (file: x.file, face: x.face));

To speed this up, enable parallel uploads. Thanks to LINQ in C#/.NET, adding a single .AsParallel() line is enough:

Dictionary<Guid, (string file, DetectedFace face)> faces = GetFiles(inFolder)
 .AsParallel() // this is the only line added
 .Select(file => 
 {
  byte[] bytes = CompressImage(file);
  var result = (file, faces: fc.Face.DetectWithStreamAsync(new MemoryStream(bytes)).GetAwaiter().GetResult());
  (result.faces.Count == 0 ? $"{file} not detect any face!!!" : $"{file} detected {result.faces.Count}.").Dump();
  return (file, faces: result.faces.ToList());
 })
 .SelectMany(x => x.faces.Select(face => (x.file, face)))
 .ToDictionary(x => x.face.FaceId.Value, x => (file: x.file, face: x.face));
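By default PLINQ chooses its degree of parallelism from the number of processor cores, which may not be the best fit when the work is mostly waiting on the network. The sketch below shows how the degree could be set explicitly with WithDegreeOfParallelism. FakeDetect is a hypothetical stand-in for the blocking upload-and-detect call, and the right degree for a real service (which may also enforce rate limits) is something to find by experiment.

```csharp
using System;
using System.Linq;
using System.Threading;

// Hypothetical demo, for illustration only.
public static class ParallelDemo
{
  // Stand-in for the blocking network call: sleeps, then returns a fake count.
  public static int FakeDetect(string file)
  {
    Thread.Sleep(50);
    return file.Length;
  }

  // Runs FakeDetect over all files with an explicit degree of parallelism.
  // Result order is not guaranteed because AsOrdered() is not used.
  public static (string file, int faces)[] DetectAll(string[] files, int degree) =>
    files.AsParallel()
         .WithDegreeOfParallelism(degree)
         .Select(f => (file: f, faces: FakeDetect(f)))
         .ToArray();
}
```

Because the results are later keyed by FaceId in a dictionary, losing the input order does not matter for this workflow.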

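Resuming After Interruption

Also as mentioned above, there are thousands of photos. If the network fails, or you knock over the coffee on your desk (who knows?), or everything works fine but you simply want to run some further analysis, everything has to start over. We can add the kind of "resume from breakpoint" mechanism familiar from download tools.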

It is really just a cache: record the detection result for each file, and on the next run read from the cache first. The cache is persisted to a JSON file.

The cache class guards its shared state with the lock keyword so that it stays thread-safe when several threads are working at once.
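A cache along these lines could be sketched as follows. This is not the author's implementation: only the names Cache<T> and GetOrCreate(key, factory) are taken from the usage in this article, while the constructor parameter, the default file name cache.json and the choice of System.Text.Json are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical sketch of a JSON-file-backed cache keyed by file path.
public class Cache<T>
{
  private readonly string _cacheFile;
  private readonly Dictionary<string, T> _items;
  private readonly object _sync = new object();

  // Loads previous results if the cache file exists (file name is an assumption).
  public Cache(string cacheFile = "cache.json")
  {
    _cacheFile = cacheFile;
    _items = File.Exists(cacheFile)
      ? JsonSerializer.Deserialize<Dictionary<string, T>>(File.ReadAllText(cacheFile))
      : new Dictionary<string, T>();
  }

  public T GetOrCreate(string key, Func<T> factory)
  {
    lock (_sync)
    {
      if (_items.TryGetValue(key, out T cached)) return cached;
    }

    // The slow call runs outside the lock so other threads are not blocked.
    T value = factory();

    lock (_sync)
    {
      _items[key] = value;
      // Rewrites the whole file on every new entry: simple, but not fast.
      File.WriteAllText(_cacheFile, JsonSerializer.Serialize(_items));
    }
    return value;
  }
}
```

Holding the lock only around the dictionary and the file write keeps the parallel uploads running concurrently. The trade-off is that two threads asking for the same missing key at the same moment would both call the factory, which should not happen here because each file is processed once.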

To use it, just wrap the body of the Select in one extra call:

var cache = new Cache<List<DetectedFace>>(); // key line
Dictionary<Guid, (string file, DetectedFace face)> faces = GetFiles(inFolder)
 .AsParallel()
 .Select(file => (file: file, faces: cache.GetOrCreate(file, () => // key line
 {
  byte[] bytes = CompressImage(file);
  var result = (file, faces: fc.Face.DetectWithStreamAsync(new MemoryStream(bytes)).GetAwaiter().GetResult());
  (result.faces.Count == 0 ? $"{file} not detect any face!!!" : $"{file} detected {result.faces.Count}.").Dump();
  return result.faces.ToList();
 })))
 .SelectMany(x => x.faces.Select(face => (x.file, face)))
 .ToDictionary(x => x.face.FaceId.Value, x => (file: x.file, face: x.face));

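Drawing a Box Around Each Face

With many photos, a large event, or group shots with dozens of people, the resulting groups look like this:

You have no idea where your own face is, so the detected face needs to be boxed.

Drawing the box takes some care. Recall that the uploaded photo had been compressed and rotated, so the coordinates in the returned DetectedFace refer to that compressed, rotated image. Applied directly to the original file, the face position would be completely wrong, so the earlier transformations have to be replayed: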

using var bmp = Bitmap.FromFile(item.info.file);
HandleOrientation(bmp, bmp.PropertyItems);
using (var g = Graphics.FromImage(bmp))
{
 using var brush = new SolidBrush(Color.Red);
 using var pen = new Pen(brush, 5.0f);
 var rect = item.info.face.FaceRectangle;
 float scale = Math.Max(1.0f, (float)(1.0 * Math.Max(bmp.Width, bmp.Height) / 1920.0));
 g.ScaleTransform(scale, scale);
 g.DrawRectangle(pen, new Rectangle(rect.Left, rect.Top, rect.Width, rect.Height));
}
bmp.Save(Path.Combine(dir, Path.GetFileName(item.info.file)));
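The coordinate mapping can also be checked without drawing anything. The sketch below computes the same scale factor as the code above and applies it to a face rectangle directly. The names FaceBoxMapping and ToOriginal are hypothetical, and rounding to whole pixels is an assumption (the original relies on Graphics.ScaleTransform instead).

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class FaceBoxMapping
{
  // Maps a rectangle reported for the downscaled upload back to pixel
  // coordinates in the full-size (already rotated) image.
  public static (int left, int top, int width, int height) ToOriginal(
    (int left, int top, int width, int height) box,
    int originalWidth, int originalHeight, int edgeLimit = 1920)
  {
    // Same factor as the drawing code: 1 when the image was never shrunk.
    double scale = Math.Max(1.0, 1.0 * Math.Max(originalWidth, originalHeight) / edgeLimit);
    return (
      (int)Math.Round(box.left * scale),
      (int)Math.Round(box.top * scale),
      (int)Math.Round(box.width * scale),
      (int)Math.Round(box.height * scale));
  }
}
```

For a 3840x2160 original the factor is 2, so the first sample rectangle from earlier (left 62, top 559, 174x174) maps to left 124, top 1118, 348x348.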

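Using my photo from above, the result looks like this (a bit like a camera's face detection while focusing):

The 1000-Face Limit

The .GroupAsync method accepts at most 1000 FaceIds per call, but the 800-odd photos from a recent event contained more than 2000 FaceIds, so they have to be split into batches.

The simplest way to batch is the System.Interactive package. It offers the convenient operators known from Rx.NET (which LINQ itself does not provide) without pulling in something as heavyweight as IObservable<T>, so it is easy to use.

Here I use the .Buffer(int) function, which splits an IEnumerable<T> into chunks of a given size (such as 1000):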

foreach (var buffer in faces
 .Buffer(1000)
 .Select((list, groupId) => (list, groupId)))
{
 GroupResult group = await fc.Face.GroupAsync(buffer.list.Select(x => x.Key).ToList());
 var folder = outFolder + @"\gid-" + buffer.groupId;
 CopyGroup(folder, group, faces);
}
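For readers who prefer not to add a package, the batching itself is easy to reproduce. Below is a minimal sketch of a chunking extension method written with plain C# iterators. InBatchesOf is a hypothetical name; it is not the System.Interactive implementation, just an illustration of what splitting into batches of at most 1000 means.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class Batching
{
  // Splits a sequence into consecutive lists of at most `size` items.
  public static IEnumerable<List<T>> InBatchesOf<T>(this IEnumerable<T> source, int size)
  {
    var batch = new List<T>(size);
    foreach (T item in source)
    {
      batch.Add(item);
      if (batch.Count == size)
      {
        yield return batch;
        batch = new List<T>(size);
      }
    }
    // Emit the final, possibly shorter, batch.
    if (batch.Count > 0) yield return batch;
  }
}
```

With 2500 items and a size of 1000 this yields batches of 1000, 1000 and 500. Note that each batch is grouped independently, so the same person may well end up in more than one gid- folder.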

Summary

The complete code used in this article is uploaded to my blog-data repository on GitHub. Supply the images and a key and it runs as is:
https://github.com/sdcb/blog-data/tree/master/2019/20191122-dotnet-face-detection

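This month I attended .NET Conf in Shanghai. I ran the code above over the 800-odd photos from the conference and it detected more than 2000 faces. I picked out the first three photos of me, with the following result: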

......

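Overall the results are quite good; it even found faces in photos with terrible resolution 😂.

Note that you do not have to use Azure Cognitive Services for face recognition. Vendors in China such as Alibaba Cloud also offer face recognition services with .NET interfaces. It all comes down to calling an API and minding its limits; the code ends up much the same.

In addition, if you need offline face recognition, Luxand offers an offline SDK called Luxand FaceSDK, which also provides a .NET interface. With no network calls, recognition is faster, and matching can reach 50 million faces per second. Accuracy is also very high, and it worked well in my own tests. The latest version at the time of writing is v7.1.0. Licensing is expensive (though a Baidu search may hold a surprise).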
