Document {Blob Path} Has Unsupported Content Type in Azure Search Indexer

If you have done some development with Azure Search and Azure Blob Storage, you may have run into the same problem. I have an Azure Search Indexer pointing to Azure Blob Storage. After the indexer runs, it reports “Document ‘{blob Path}’ has unsupported content type ‘unsupported’”.

According to the Microsoft Docs article Indexing Documents in Azure Blob Storage with Azure Search, the supported document formats are:

  • PDF
  • Microsoft Office formats: DOCX/DOC, XLSX/XLS, PPTX/PPT, MSG (Outlook emails)
  • HTML
  • XML
  • ZIP
  • EML
  • RTF
  • Plain text files (see also Indexing plain text)
  • JSON (see Indexing JSON blobs)
  • CSV (see Indexing CSV blobs preview feature)

I get the above error for txt, msg, and html files. I asked around, and someone from Microsoft asked me to test with very simple content and build up from there until I hit the error. But I found that every file fails unless its content is blank.
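
To see which formats are affected, it can help to group the indexer errors by file extension. The following is a minimal sketch, assuming the indexer status response (from the Get Indexer Status REST call) carries its errors under `lastResult.errors`, each with an `errorMessage` in the form quoted above. The function name `failing_extensions` and the sample payload are hypothetical, and the payload is made-up illustration data, not output from a real service.

```python
import os
import re
from collections import Counter

# Matches the message form quoted in this post:
#   Document '{blob path}' has unsupported content type 'unsupported'
# Both straight and curly quotes are accepted around the path.
UNSUPPORTED = re.compile(
    r"Document\s+['‘](?P<path>.+?)['’]\s+has unsupported content type",
    re.IGNORECASE,
)


def failing_extensions(status):
    """Count 'unsupported content type' errors per file extension.

    `status` is the parsed JSON of an indexer status response. The
    `lastResult` / `errors` / `errorMessage` field names are assumptions
    about that response shape.
    """
    errors = (status.get("lastResult") or {}).get("errors") or []
    counts = Counter()
    for error in errors:
        match = UNSUPPORTED.search(error.get("errorMessage") or "")
        if match:
            extension = os.path.splitext(match.group("path"))[1].lower()
            counts[extension or "(none)"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical payload for illustration only.
    sample_status = {
        "lastResult": {
            "errors": [
                {"errorMessage": "Document 'https://example.invalid/docs/a.txt' "
                                 "has unsupported content type 'unsupported'"},
                {"errorMessage": "Document 'https://example.invalid/docs/b.html' "
                                 "has unsupported content type 'unsupported'"},
                {"errorMessage": "Some other failure"},
            ]
        }
    }
    for extension, count in failing_extensions(sample_status).items():
        print(extension, count)
```

Errors that do not match the message pattern are ignored, so the counts only reflect this particular failure.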

After a few months of trying, testing, and going back and forth in comments, my boss gave me a deadline because we could not wait any longer to launch the application. In the end, the only way I found to make it work was to turn off “failOnUnsupportedContentType” in the indexer configuration using the REST API.

"parameters": {
    "configuration": {
        "indexedFileNameExtensions": ".html,.txt,.pdf,.docx",
        "excludedFileNameExtensions": ".bmp,.dib,.png,.jpeg,.jpg,.jpe,.jfif,.gif,.tif,.tiff,.ico",
        "failOnUnsupportedContentType": false
    }
},
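
For anyone who wants to script this change, here is a rough sketch of how the fragment above could be sent with the Update Indexer REST call (a `PUT` to `/indexers/{name}` with an `api-key` header). The service name, indexer name, data source name, index name, and API version below are placeholders I made up, not values from my project. Since the PUT replaces the whole indexer definition, the body has to include the data source and target index as well.

```python
import json
import urllib.request

# Assumption: a REST API version that supports blob indexers.
API_VERSION = "2020-06-30"


def build_indexer_definition(name, data_source_name, target_index_name):
    """Return a full indexer definition with the setting turned off."""
    return {
        "name": name,
        "dataSourceName": data_source_name,
        "targetIndexName": target_index_name,
        "parameters": {
            "configuration": {
                "indexedFileNameExtensions": ".html,.txt,.pdf,.docx",
                "excludedFileNameExtensions":
                    ".bmp,.dib,.png,.jpeg,.jpg,.jpe,.jfif,.gif,.tif,.tiff,.ico",
                "failOnUnsupportedContentType": False,
            }
        },
    }


def build_update_request(service_name, admin_api_key, definition):
    """Build (but do not send) the PUT request that updates the indexer."""
    url = (
        f"https://{service_name}.search.windows.net"
        f"/indexers/{definition['name']}?api-version={API_VERSION}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(definition).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json", "api-key": admin_api_key},
    )


if __name__ == "__main__":
    # Hypothetical names; replace them with your own resources.
    definition = build_indexer_definition(
        "my-blob-indexer", "my-blob-datasource", "my-index"
    )
    request = build_update_request("my-search-service", "<admin-api-key>", definition)
    print(request.get_method(), request.full_url)
    print(json.dumps(definition, indent=2))
    # To actually apply the change, send it:
    # urllib.request.urlopen(request)
```

The request is only built and printed here, so the body can be reviewed before anything is sent to the service.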

After that, my indexer runs fine, and every newly uploaded blob gets the data I want into the Azure Search index. I hope this helps you. If you know why these formats (txt, msg, html) are listed as supported but keep producing the unsupported error, please leave me a message. I will also update this post if I find anything new.

2 Replies to “Document {Blob Path} Has Unsupported Content Type in Azure Search Indexer”

  1. Hi!

    I have the same issue with docx files. I indexed ~2500 files, but ~100 of these Word documents gave me this error as well: Document has unsupported content type ‘unsupported’.

    These failed docs really look the same as the others, so I have no idea where it goes wrong either.

    Regards Axel

    1. Hi Axel,
      If you turn off “failOnUnsupportedContentType”, then all files should be indexed. My project has since moved from Azure Search to Azure-hosted Solr Search. Somehow it provides better performance and doesn’t have these errors.

      Regards,
      Ken Lin
