File Types
The following container types are supported both as upload containers and for processing:
Extension | Kind of Document | Media/MIME Type |
---|---|---|
.gz | GZip compressed archive | application/gzip |
.bz2 | ||
.iso | ||
.mbox | ||
.pst | Microsoft Outlook .pst files 2000 / 2003 / 2007 / 2010 / 2013 / 2016 / 2019 / 2021 / Outlook 365 | |
.rar | RAR archive | application/vnd.rar |
.tar | Tar archive | |
.vdi | ||
.vhd | ||
.vmdk | ||
.z | GNU-compressed files | |
.zip | ZIP archive | application/zip |
.7z | 7-zip archive | application/x-7z-compressed |
The following file types are supported and tested by Canopy for processing. Canopy takes a best-effort approach to processing files that are not on this list, so the files it can process extend beyond this list:
Extension | Kind of Document | Media/MIME Type |
---|---|---|
.accdb | Microsoft Access 2007+ | |
.bak, .mdf | Microsoft SQL | |
.bmp | Bitmap | image/bmp |
.csv | CSV (Comma-Separated Values) | text/csv |
.dat | DAT (DAT files) | |
.dcm | Digital Imaging and Communications in Medicine (DICOM) | |
.doc | Microsoft Word | application/msword |
.docx | Microsoft Word XML Format | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
.dot | Microsoft Word Template | application/msword |
.dotm | Microsoft Document Macro Enabled Template | application/vnd.ms-word.template.macroenabled.12 |
.dotx | Microsoft Document XML Template | application/vnd.openxmlformats-officedocument.wordprocessingml.template |
.eml | Emails | message/rfc822 |
.gif | Graphics Interchange Format (GIF) | image/gif |
.gz | GZip compressed archive | application/gzip |
.heif, .heic | High Efficiency Image File Format (HEIF) Family | image/heic; image/heic-sequence; image/heif; image/heif-sequence |
.iso | ||
.jpg, .jpeg, .jpe | Joint Photographic Experts Group (JPEG) | image/jpeg |
.key | Apple iWork Keynote | application/x-iwork-keynote-sffkey |
.mbox | ||
.mdb | Microsoft Access Format < 2007 | |
.msg | Microsoft Message File | |
.numbers | Apple iWork Numbers < version 12 | application/x-iwork-numbers-sffnumbers |
.pages | Apple iWork Pages | application/x-iwork-pages-sffpages |
.pdf | Adobe Portable Document Format | application/pdf |
.png | Portable Network Graphics (PNG) | image/png |
.pot | Microsoft PowerPoint Template | application/vnd.ms-powerpoint |
.potm | Microsoft PowerPoint Macro Enabled Template | application/vnd.ms-powerpoint.template.macroenabled.12 |
.potx | Microsoft PowerPoint XML Template | application/vnd.openxmlformats-officedocument.presentationml.template |
.pps | Microsoft PowerPoint Slide Show | application/vnd.ms-powerpoint |
.ppsm | Microsoft PowerPoint Macro Enabled Slide Show | application/vnd.ms-powerpoint.slideshow.macroenabled.12 |
.ppsx | Microsoft PowerPoint XML Slide Show | application/vnd.openxmlformats-officedocument.presentationml.slideshow |
.ppt | Microsoft PowerPoint | application/vnd.ms-powerpoint |
.pptm | Microsoft PowerPoint Macro Enabled | application/vnd.ms-powerpoint.presentation.macroenabled.12 |
.pptx | Microsoft PowerPoint XML Format | application/vnd.openxmlformats-officedocument.presentationml.presentation |
.pst | Microsoft Outlook .pst files 2000 / 2003 / 2007 / 2010 / 2013 / 2016 / 2019 / 2021 / Outlook 365 | |
.psv | PSV (Pipe-Separated Values) | text/plain; charset=ISO-8859-1 |
.rar | RAR archive | application/vnd.rar |
.sas7bdat | Statistical Analysis System (SAS) Database | application/x-sas-data |
.tiff | Tag Image File Format | image/tiff |
.tsv, .tab | TSV / TAB (Tab-Separated Values) | text/tab-separated-values; charset=ISO-8859-1 |
.txt | TXT (Text files) | text/plain |
.vdi | ||
.vhd | ||
.vmdk | ||
.wpd | WordPerfect 6 | application/vnd.wordperfect; version=6.x |
.w51 | WordPerfect 5.1 | application/vnd.wordperfect; version=5.1 |
.xla | Microsoft Excel Add-Ins | application/vnd.ms-excel |
.xlam | Microsoft Excel Macro-Enabled | application/vnd.ms-excel.addin.macroenabled.12 |
.xls | Microsoft Excel | application/vnd.ms-excel |
.xlsb | Microsoft Excel Binary Macro Enabled | application/vnd.ms-excel.sheet.binary.macroenabled.12 |
.xlsx | Microsoft Excel XML Format | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet |
.xlt | Microsoft Excel Template | application/vnd.ms-excel |
.xltx | Microsoft Excel XML Template | application/vnd.openxmlformats-officedocument.spreadsheetml.template |
.xlw | Microsoft Excel Workspace | application/vnd.ms-excel |
.z | GNU-compressed files | |
.zip | ZIP archive | application/zip |
.7z | 7-zip archive | application/x-7z-compressed |
The following unsupported file types are forced to fail during processing:
File type | Extensions |
---|---|
Database | .sqlite |
Mailbox Files | .ost, .nsf |
Spreadsheets | .numbers version >= 12 |
Presentations | .ppam |
Images | .svg |
Zeiss CSI images | .czi |
Aperio SVS images | .svs |
Aperio fluorescent images | .afi |
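A check against the table above can be run before upload to catch files that would fail. The following is a minimal sketch, not part of Canopy: the helper `find_force_fail` and the set `FORCE_FAIL_EXTENSIONS` are hypothetical names. It looks only at file extensions, so it cannot tell a `.numbers` file of version 12 or later from an earlier one, and that extension is deliberately left out.

```python
from pathlib import PurePath

# Extensions copied from the table above. ".numbers" is omitted on purpose:
# only version 12 and later fails, and the version is not visible in the name.
FORCE_FAIL_EXTENSIONS = {
    ".sqlite", ".ost", ".nsf", ".ppam", ".svg", ".czi", ".svs", ".afi",
}


def find_force_fail(filenames):
    """Return the names whose extension is on the force-fail list (hypothetical helper)."""
    return [
        name
        for name in filenames
        if PurePath(name).suffix.lower() in FORCE_FAIL_EXTENSIONS
    ]


flagged = find_force_fail(["mail/archive.OST", "report.docx", "slide.svs", "notes.txt"])
print(flagged)
```

The comparison is case-insensitive, on the assumption that `archive.OST` and `archive.ost` are treated the same way during processing.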
When the processing pipeline determines that a file matches one of the following media types, the file is marked as skipped and removed from further processing.
Currently, processing skips the following files by media type:
Media/MIME Types | Non Exhaustive Extension List | Kind of Document |
---|---|---|
application/atom+xml | .atom | Atom Syndication Format |
application/epub+zip | .epub | Electronic publication (EPUB) |
application/font-sfnt | .ttf, .otf, .ttc | TrueType or OpenType |
application/geotopic | .xml, .rdf, .json, .html, .csv | ISO/TS 19139-1:2019 Geographic Information |
application/java-vm | .class | Java Class File |
application/octet-stream | .bin | Uninterpreted binary |
application/pkcs7-signature, application/pkcs7-mime | .p7c, .p7z, .p7s, .p7m | PKCS #7 digital signatures and certificates |
application/rss+xml | .rss, .xml, .rdf | RDF Site Summary (RSS) |
application/step | .st, .step, .stp | ISO-10303 STEP data |
application/timestamped-data | .tsd | TimeStampedData |
application/vnd.ms-fontobject | .eot | Embedded OpenType (EOT) |
application/vnd.ms-htmlhelp | .chm | Microsoft Compiled HTML Help (CHM) |
application/x-dosexec | .exe | DOS/Windows executable (EXE) |
application/x-elf | .elf | Executable and Linkable Format (ELF) |
application/x-font-adobe-metric | .afm, .amfm, .acfm | Adobe Multiple Font Metrics Format Files |
application/x-font-ttf | .ttf | TrueType Font |
application/x-font-type1 | .pfa, .pfb | PostScript Type 1 Fonts |
application/x-hdf | .hdf, .he5, .h5 | Hierarchical Data Format File |
application/x-matlab-data | .mat | MATLAB Files |
application/x-msdownload | .exe, .dll, .ocx, .msi, .msp, .cab, .bat, .com, .scr | Portable Executable (PE) |
application/x-msdownload; format=pe32 | .exe, .dll, .ocx, .sys | Portable Executable (PE) format for 32-bit Windows |
application/x-msdownload; format=pe64 | .exe, .dll, .ocx, .sys | Portable Executable (PE) format for 64-bit Windows |
application/x-netcdf | .nc, .cdf | Network Common Data Form |
application/x-object | .o, .obj, .coff | Object code files |
application/x-sharedlib | .so, .dll | Shared library files |
multipart/appledouble | | Apple Double Resource Files |
text/x-c++ | .cpp, .cc, .cxx, .c++, .h, .hpp, .hxx, .hh | C++ Source Code and Header Files |
text/x-c++src | .cpp, .cxx, .cc, .C, .c++, .CPP | C++ Source Code |
Currently, processing skips the following files by extension:
- .out
- .pack
- .pbxproj
- .abcdp
- .xcuserstate
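The two skip rules above (by media type, then by extension) can be combined into a single check. This is a rough sketch under stated assumptions, not Canopy's implementation: `is_skipped`, `SKIPPED_MEDIA_TYPES`, and `SKIPPED_EXTENSIONS` are hypothetical names, the media type set is only a subset of the table, and the media type is assumed to be detected elsewhere and passed in as a string. Dropping parameters such as `; format=pe32` before comparing is also an assumption.

```python
from pathlib import PurePath

# A subset of the media types listed in the table above, for illustration only.
SKIPPED_MEDIA_TYPES = {
    "application/epub+zip",
    "application/java-vm",
    "application/octet-stream",
    "application/x-dosexec",
    "application/x-msdownload",
    "text/x-c++src",
}

# The extension list above, copied as-is.
SKIPPED_EXTENSIONS = {".out", ".pack", ".pbxproj", ".abcdp", ".xcuserstate"}


def is_skipped(filename, media_type):
    """Hypothetical check: True if the file matches a skipped media type or extension."""
    # Assumption: compare on the base type, ignoring parameters like "; format=pe32".
    base_type = media_type.split(";", 1)[0].strip().lower()
    if base_type in SKIPPED_MEDIA_TYPES:
        return True
    return PurePath(filename).suffix.lower() in SKIPPED_EXTENSIONS


print(is_skipped("setup.exe", "application/x-msdownload; format=pe32"))
print(is_skipped("a.out", "text/plain"))
print(is_skipped("notes.txt", "text/plain"))
```

The media type is checked first because the extension lists in the table are described as non-exhaustive, so the detected type is the more reliable signal.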
In older projects, when the processing pipeline determines that a file has one of the following extensions, the file is marked as skipped and removed from further processing:
- .afm
- .atom
- .axf
- .bin
- .c++
- .cc
- .chm
- .class
- .cpp
- .cxx
- .dat
- .dll
- .elf
- .eot
- .epub
- .exe
- .fb2
- .fbz
- .geot
- .hdf
- .ibooks
- .iso
- .ko
- .mat
- .mod
- .nc
- .o
- .p7c
- .p7m
- .p7s
- .pfa
- .pfb
- .prx
- .puff
- .rss
- .sfnt
- .so
- .tsd
- .ttf
- .woff
- .woff2