Posted on 2021-08-18
Ryax Technologies

NSFW image classification


Description

This model classifies imagery as either Suitable For Work (SFW) or Not Suitable For Work (NSFW) based on the presence of pornographic content. It takes an image as input and returns a JSON output with floating-point scores for the model’s estimated SFW and NSFW probabilities.
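As a minimal sketch of consuming the model's output, the snippet below parses a JSON result of the kind described above. The exact field names ("sfw", "nsfw") are assumptions for illustration; the real payload shape may differ.

```python
import json

# Hypothetical model output: field names are an assumption, not documented here.
raw_output = '{"sfw": 0.93, "nsfw": 0.07}'

scores = json.loads(raw_output)

# The two scores are complementary probabilities and should sum to ~1.0.
assert abs(scores["sfw"] + scores["nsfw"] - 1.0) < 1e-6

print(f"SFW: {scores['sfw']:.2f}, NSFW: {scores['nsfw']:.2f}")
```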

Business benefit

This model can be used forensically across an IT system to detect unauthorized media. It can also moderate data flows within automated jobs and segregate data when an end user’s role requires viewing potentially objectionable content.

Relevant for

  • E-commerce/Digital
  • Customer Experience
  • Fraud prevention
  • General services
  • Detection specialists
  • HR
  • IT departments
  • Law enforcement
  • Forensics
  • Other

Data inputs (mandatory)

▪ Image (100 MB max; .jpg, .png, .tif)

Data Output

▪ Text file containing the SFW and NSFW scores in JSON format (1 MB max)

Technical description

PERFORMANCE METRICS
This model wraps Yahoo!’s NSFW deep learning neural network, which was open sourced in 2016. Performance metrics were not provided in the Git repository. However, given the exceedingly low recall of current manual methods and the importance of the model to current client systems, we decided to include the algorithm in the Modzy platform. Performance for a custom use case can be tuned by adjusting the minimum NSFW probability score at which the end user wishes to flag an image as NSFW. Yahoo! recommends scores > 0.8 and < 0.2 as reasonable default thresholds for considering an image NSFW or SFW, respectively.
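The recommended cut-offs above can be sketched as a simple thresholding function. The three-way split (including a "review" band between the two thresholds) is an illustrative design choice, not part of Yahoo!'s recommendation; both thresholds are tunable per deployment.

```python
def classify(nsfw_score, nsfw_threshold=0.8, sfw_threshold=0.2):
    """Apply Yahoo!'s recommended default cut-offs to a raw NSFW probability.

    Scores between the two thresholds are ambiguous and, in this sketch,
    are routed to human review rather than forced into either class.
    """
    if nsfw_score > nsfw_threshold:
        return "NSFW"
    if nsfw_score < sfw_threshold:
        return "SFW"
    return "REVIEW"

print(classify(0.95))  # NSFW
print(classify(0.05))  # SFW
print(classify(0.50))  # REVIEW
```

Lowering `nsfw_threshold` raises recall at the cost of more false positives, which matters when, as noted above, manual methods already have very low recall.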

OVERVIEW
This model was created by Yahoo! Engineering. It was trained by their staff on an internal dataset of imagery they considered NSFW. Because the NSFW label is subjective and context-dependent, concepts that are inappropriate in one setting may be acceptable in another: a gory image might be considered NSFW at a children’s book publisher but commonplace and acceptable at a medical textbook publisher. This lack of an absolute standard for NSFW led the Yahoo! team to focus the model on the one category that is considered NSFW in the majority of workplace environments: pornographic imagery.

The model uses a ResNet-50 architecture.

TRAINING
The model was developed by training several versions of its architecture on the 1,000-class ImageNet dataset. The version that performed best on ImageNet then underwent transfer learning on datasets identified and editorially labeled by Yahoo!. The training data for this NSFW model consisted of a large collection of pornographic and non-pornographic imagery collected by and housed at Yahoo!. Due to the nature of the training material and copyright concerns, Yahoo! chose not to release the training data.

VALIDATION
The model was validated against a holdout set of the pornographic and non-pornographic imagery housed at Yahoo!. As with the training data, the validation set was not released.