advsecurenet.evaluation.evaluators package

advsecurenet.evaluation.evaluators.attack_success_rate_evaluator module

class advsecurenet.evaluation.evaluators.attack_success_rate_evaluator.AttackSuccessRateEvaluator

Bases: BaseEvaluator

Evaluates the attack success rate for adversarial examples.

Note

This evaluation only considers the samples where the model's initial prediction is correct, so that the metrics are not skewed by incorrect initial predictions.

The results are calculated as the number of successful attacks divided by the total number of samples considered. The results lie in the range [0, 1], where 1 indicates that all attacks were successful and 0 indicates that no attack was successful or that there were no samples to evaluate.

get_results() → float

Calculates the attack success rate for the streaming session.

Returns: The attack success rate for the adversarial examples in the streaming session.
Return type: float

reset(): Resets the evaluator for a new streaming session.

update(model: BaseModel, original_images: Tensor, true_labels: Tensor, adversarial_images: Tensor, is_targeted: bool | None = False, target_labels: Tensor | None = None)

Updates the evaluator with new data for streaming mode.

Parameters:

model (BaseModel) – The model being evaluated.
original_images (torch.Tensor) – The original images.
true_labels (torch.Tensor) – The true labels for the original images.
adversarial_images (torch.Tensor) – The adversarial images.
is_targeted (bool, optional) – Whether the attack is targeted. Defaults to False.
target_labels (torch.Tensor, optional) – Target labels for the adversarial images if the attack is targeted. Defaults to None.

Note

This function only considers the samples where the model’s initial prediction is correct. This is to ensure that the metrics are not skewed by incorrect initial predictions.
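The filtering rule described in the note can be sketched in plain PyTorch. This is a minimal illustration, not the library's implementation: the function name `attack_success_rate` and its prediction-based arguments are hypothetical, and it assumes the denominator is the number of initially correct samples.

```python
from typing import Optional

import torch


def attack_success_rate(
    clean_preds: torch.Tensor,
    adv_preds: torch.Tensor,
    true_labels: torch.Tensor,
    target_labels: Optional[torch.Tensor] = None,
) -> float:
    """Hypothetical helper: success rate over initially correct samples."""
    # Keep only samples the model classified correctly before the attack.
    initially_correct = clean_preds == true_labels
    total = int(initially_correct.sum())
    if total == 0:
        return 0.0  # no samples to evaluate
    if target_labels is not None:
        # Targeted: success means the adversarial prediction hits the target.
        success = adv_preds == target_labels
    else:
        # Untargeted: success means the prediction no longer matches the label.
        success = adv_preds != true_labels
    return int((success & initially_correct).sum()) / total


true_labels = torch.tensor([0, 1, 2, 3])
clean_preds = torch.tensor([0, 1, 2, 0])  # last sample is already misclassified
adv_preds = torch.tensor([1, 1, 0, 2])
rate = attack_success_rate(clean_preds, adv_preds, true_labels)
```

Here three samples are initially correct and two of them are flipped by the attack, so the last sample does not count in either direction.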

advsecurenet.evaluation.evaluators.perturbation_distance_evaluator module

class advsecurenet.evaluation.evaluators.perturbation_distance_evaluator.PerturbationDistanceEvaluator

Bases: BaseEvaluator

Evaluator for the perturbation distance. Supported metrics are:

- L0
- L2
- Linf

calculate_l0_distance(original_images: Tensor, adversarial_images: Tensor) → float

Calculates the L0 distance between the original and adversarial images. L0 distance is the count of pixels that are different between the two images (i.e. the number of pixels that have been changed in the adversarial image compared to the original image).

Parameters:

original_images (torch.Tensor) – The original images.
adversarial_images (torch.Tensor) – The adversarial images.

Returns:

The mean L0 distance between the original and adversarial images.

Return type:

float

calculate_l2_distance(original_images: Tensor, adversarial_images: Tensor) → float

Calculates the L2 distance between the original and adversarial images. L2 distance is the Euclidean distance between the two images.

Parameters:

original_images (torch.Tensor) – The original images.
adversarial_images (torch.Tensor) – The adversarial images.

Returns:

The mean L2 distance between the original and adversarial images.

Return type:

float

calculate_l_inf_distance(original_images: Tensor, adversarial_images: Tensor) → float

Calculates the L∞ distance between the original and adversarial images. L∞ distance is the maximum absolute difference between the two images in any pixel.

Parameters:

original_images (torch.Tensor) – The original images.
adversarial_images (torch.Tensor) – The adversarial images.

Returns:

The mean L∞ distance between the original and adversarial images.

Return type:

float

calculate_perturbation_distances(original_images: Tensor, adversarial_images: Tensor) → tuple[float, float, float]

Calculates the L0, L2, and L∞ distances between the original and adversarial images.

Parameters:

original_images (torch.Tensor) – The original images.
adversarial_images (torch.Tensor) – The adversarial images.

Returns:

The mean L0, L2, and L∞ distances between the original and adversarial images.

Return type:

Tuple[float, float, float]
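As a rough illustration of the three metrics, the sketch below computes per-image distances and averages them over the batch. The function name `perturbation_distances` is hypothetical, and it assumes that L0 counts changed tensor elements (one per pixel and channel); the library may count pixels differently.

```python
import torch


def perturbation_distances(original: torch.Tensor, adversarial: torch.Tensor):
    """Hypothetical helper: mean L0, L2 and Linf distances over a batch."""
    # Flatten each image so that the norms are taken per sample.
    diff = (adversarial - original).flatten(start_dim=1)
    l0 = (diff != 0).sum(dim=1).float().mean().item()    # changed elements
    l2 = diff.norm(p=2, dim=1).mean().item()             # Euclidean distance
    l_inf = diff.abs().max(dim=1).values.mean().item()   # largest single change
    return l0, l2, l_inf


original = torch.zeros(2, 1, 2, 2)
adversarial = original.clone()
adversarial[0, 0, 0, 0] = 3.0
adversarial[0, 0, 0, 1] = 4.0
l0, l2, l_inf = perturbation_distances(original, adversarial)
```

Only the first image is perturbed (two elements, by 3 and 4), so the batch means are half of that image's distances.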

get_perturbation_distance(distance_type: str) → float

Calculates the mean perturbation distance for the streaming session for the specified distance type.

Parameters: distance_type (str) – The distance type. Valid values are: L0, L2, Linf.
Returns: The mean perturbation distance for the adversarial examples in the streaming session.
Return type: float

get_results() → dict[str, float]

Calculates the mean perturbation distances for the streaming session.

Returns: The mean perturbation distances for the adversarial examples in the streaming session.
Return type: dict[str, float]

reset(): Resets the evaluator for a new streaming session.

update(original_images: Tensor, adversarial_images: Tensor)

Updates the evaluator with new data for streaming mode. Expects the images in their original, unnormalized value range.

Parameters:

original_images (torch.Tensor) – The original images.
adversarial_images (torch.Tensor) – The adversarial images.

advsecurenet.evaluation.evaluators.perturbation_effectiveness_evaluator module

class advsecurenet.evaluation.evaluators.perturbation_effectiveness_evaluator.PerturbationEffectivenessEvaluator

Bases: BaseEvaluator

Evaluator for the perturbation effectiveness. The effectiveness score is the attack success rate divided by the perturbation distance. The higher the score, the more effective the attack.

calculate_perturbation_effectiveness_score(attack_success_rate: float, perturbation_distance: float) → float

Calculates the perturbation effectiveness score for the attack. The effectiveness score is the attack success rate divided by the perturbation distance. The higher the score, the more effective the attack. The purpose of this metric is to distinguish between attacks that have a high success rate but require a large perturbation magnitude, and attacks that have a lower success rate but require a smaller perturbation magnitude.

Parameters:

attack_success_rate (float) – The attack success rate.
perturbation_distance (float) – The perturbation distance.

Returns: The effectiveness score.
Return type: float
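The ratio can be illustrated with a small standalone function. This is only a sketch: the function name is hypothetical, and returning 0.0 for a zero distance is an assumption made here to avoid division by zero, not documented behavior.

```python
def perturbation_effectiveness_score(
    attack_success_rate: float, perturbation_distance: float
) -> float:
    """Hypothetical helper: success rate per unit of perturbation."""
    if perturbation_distance == 0:
        # Assumption: treat a zero distance as "no score" instead of dividing.
        return 0.0
    return attack_success_rate / perturbation_distance


# A strong attack that needs a large perturbation ...
strong_but_large = perturbation_effectiveness_score(0.9, 3.0)
# ... versus a weaker attack that needs a much smaller one.
weaker_but_small = perturbation_effectiveness_score(0.6, 1.0)
```

With these illustrative numbers the weaker attack scores higher, which is the trade-off the metric is meant to expose.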

get_results() → float

Calculates the mean perturbation effectiveness score for the streaming session.

Returns: The mean perturbation effectiveness score for the adversarial examples in the streaming session.
Return type: float

reset(): Resets the evaluator for a new streaming session.

update(attack_success_rate: float, perturbation_distance: float)

Updates the evaluator with new data for streaming mode.

Parameters:

attack_success_rate (float) – The attack success rate.
perturbation_distance (float) – The perturbation distance.

advsecurenet.evaluation.evaluators.robustness_gap_evaluator module

class advsecurenet.evaluation.evaluators.robustness_gap_evaluator.RobustnessGapEvaluator

Bases: BaseEvaluator

Evaluator for the robustness gap. The robustness gap is the difference between the accuracy of the model on clean and adversarial examples. Currently, this metric doesn’t support targeted attacks and doesn’t have an option to filter out the initially misclassified images.

get_results() → dict[str, float]

Calculates the robustness gap for the streaming session.

Returns: The robustness gap for the adversarial examples in the streaming session.
Return type: dict[str, float]

reset(): Resets the evaluator for a new streaming session.

update(model: BaseModel, original_images: Tensor, true_labels: Tensor, adversarial_images: Tensor)

Updates the evaluator with new data for streaming mode.

Parameters:

model (BaseModel) – The model being evaluated.
original_images (torch.Tensor) – The original images.
true_labels (torch.Tensor) – The true labels for the original images.
adversarial_images (torch.Tensor) – The adversarial images.
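A minimal sketch of the robustness gap, assuming predictions have already been computed for both the clean and the adversarial batch. The function name and the dictionary keys are hypothetical; the keys actually returned by get_results() are not specified here.

```python
import torch


def robustness_gap(
    clean_preds: torch.Tensor, adv_preds: torch.Tensor, true_labels: torch.Tensor
) -> dict:
    """Hypothetical helper: clean accuracy minus adversarial accuracy."""
    clean_accuracy = (clean_preds == true_labels).float().mean().item()
    adversarial_accuracy = (adv_preds == true_labels).float().mean().item()
    return {
        "clean_accuracy": clean_accuracy,
        "adversarial_accuracy": adversarial_accuracy,
        "robustness_gap": clean_accuracy - adversarial_accuracy,
    }


true_labels = torch.tensor([0, 1, 2, 3])
clean_preds = torch.tensor([0, 1, 2, 0])  # 3 of 4 correct on clean images
adv_preds = torch.tensor([1, 1, 0, 2])    # 1 of 4 correct on adversarial images
gap = robustness_gap(clean_preds, adv_preds, true_labels)
```

Unlike the attack success rate, all samples are counted here, including those that were misclassified before the attack.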

advsecurenet.evaluation.evaluators.similarity_evaluator module

class advsecurenet.evaluation.evaluators.similarity_evaluator.SimilarityEvaluator

Bases: BaseEvaluator

Calculates the structural similarity index (SSIM) and peak signal-to-noise ratio (PSNR) between the original and adversarial images. This evaluator supports both streaming and non-streaming modes. In streaming mode, the evaluator can be updated with new data and the results are calculated on the fly. In non-streaming mode, the evaluator returns the results for the provided data only.

For streaming mode, the evaluator can be updated with new data using the update method. The results can be obtained using the get_results method.

Note

The SSIM and PSNR metrics expect the images to be in the original range. If the images are normalized, they need to be denormalized before calculating the metrics.

Example

>>> from advsecurenet.evaluation.evaluators.similarity_evaluator import SimilarityEvaluator
>>> with SimilarityEvaluator() as evaluator:
...     for batch in data_loader:
...         # Generate adversarial images
...         ...
...         # Update evaluator with new data
...         evaluator.update(original_images, adversarial_images)
...     # Get results: a dict containing the mean SSIM and PSNR
...     results = evaluator.get_results()

For non-streaming mode, the evaluator can be used as a normal function.

Example

>>> from advsecurenet.evaluation.evaluators.similarity_evaluator import SimilarityEvaluator
>>> similarity_evaluator = SimilarityEvaluator()
>>> ssim_score, psnr_score = similarity_evaluator.calculate_similarity_scores(original_images, adversarial_images)
>>> print(f"SSIM: {ssim_score}, PSNR: {psnr_score}")

calculate_psnr(original_images: Tensor, adversarial_images: Tensor) → float

Calculates the mean peak signal-to-noise ratio (PSNR) between the original and adversarial images. PSNR is a metric that measures the similarity between two images. The higher the PSNR, the more similar the images are and the lower the distortion between them. A high PSNR could indicate that the perturbations introduced are subtle but may not necessarily reflect the perceptual similarity between the images.

Parameters:

original_images (torch.Tensor) – The original images. Expected shape is (batch_size, channels, height, width).
adversarial_images (torch.Tensor) – The adversarial images.

Returns:

The mean PSNR between the original and adversarial images, in the range [0, inf]. Higher values (e.g. 30 dB or more) indicate better quality; the value is infinite if the images are identical.

Return type:

float
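For reference, PSNR is commonly defined as 10 * log10(MAX^2 / MSE), where MAX is the largest possible pixel value. The sketch below applies that standard definition per image and averages over the batch. The function name `mean_psnr` and the `max_value` parameter are hypothetical, and the default of 1.0 assumes images scaled to [0, 1].

```python
import torch


def mean_psnr(
    original: torch.Tensor, adversarial: torch.Tensor, max_value: float = 1.0
) -> float:
    """Hypothetical helper: mean PSNR in dB for a (N, C, H, W) batch."""
    # Mean squared error per image.
    mse = ((original - adversarial) ** 2).flatten(start_dim=1).mean(dim=1)
    # Division by a zero MSE yields inf, so identical images give infinite PSNR.
    psnr = 10.0 * torch.log10(max_value ** 2 / mse)
    return psnr.mean().item()


original = torch.zeros(1, 1, 2, 2)
noisy = original + 0.1  # uniform error of 0.1, so the MSE is 0.01
psnr_noisy = mean_psnr(original, noisy)
psnr_same = mean_psnr(original, original)
```

A uniform error of 0.1 on a [0, 1] scale gives roughly 20 dB, while comparing a batch with itself gives an infinite value.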

calculate_similarity_scores(original_images: Tensor, adversarial_images: Tensor) → tuple[float, float]

Calculates the SSIM and PSNR between the original and adversarial images.

Parameters:

original_images (torch.Tensor) – The original images. Expected shape is (batch_size, channels, height, width).
adversarial_images (torch.Tensor) – The adversarial images.

Returns:

The mean SSIM and PSNR between the original and adversarial images.

Return type:

Tuple[float, float]

calculate_ssim(original_images: Tensor, adversarial_images: Tensor) → float

Calculates the mean structural similarity index (SSIM) between the original and adversarial images. SSIM is a metric that measures the similarity between two images. The higher the SSIM, the more similar the images are.

Parameters:

original_images (torch.Tensor) – The original images. Expected shape is (batch_size, channels, height, width).
adversarial_images (torch.Tensor) – The adversarial images.

Returns:

The mean SSIM between the original and adversarial images, in the range [-1, 1]. A value of 1 means the images are identical.

Return type:

float

get_psnr() → float

Calculates the mean PSNR between the original and adversarial images for all the data seen so far.

Returns: The mean PSNR between the original and adversarial images, in the range [0, inf]. Higher values (e.g. 30 dB or more) indicate better quality; the value is infinite if the images are identical.
Return type: float

get_results() → dict[str, float]

Calculates the mean SSIM and PSNR between the original and adversarial images for all the data seen so far.

Returns: A dictionary containing the mean SSIM and PSNR between the original and adversarial images.
Return type: dict[str, float]

get_ssim() → float

Calculates the mean SSIM between the original and adversarial images for all the data seen so far.

Returns: The mean SSIM between the original and adversarial images, in the range [-1, 1]. A value of 1 means the images are identical.
Return type: float

reset(): Resets the evaluator.

update(original_images: Tensor, adversarial_images: Tensor)

Updates the evaluator with new data for both SSIM and PSNR.

Parameters:

original_images (torch.Tensor) – The original images. Expected shape is (batch_size, channels, height, width).
adversarial_images (torch.Tensor) – The adversarial images.

update_psnr(original_images: Tensor, adversarial_images: Tensor, update_total_images: bool = True)

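Updates the evaluator's PSNR statistics with new data.

Parameters:

original_images (torch.Tensor) – The original images. Expected shape is (batch_size, channels, height, width).
adversarial_images (torch.Tensor) – The adversarial images.
update_total_images (bool, optional) – Whether to update the total image count. Defaults to True.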

update_ssim(original_images: Tensor, adversarial_images: Tensor, update_total_images: bool = True)

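Updates the evaluator's SSIM statistics with new data.

Parameters:

original_images (torch.Tensor) – The original images. Expected shape is (batch_size, channels, height, width).
adversarial_images (torch.Tensor) – The adversarial images.
update_total_images (bool, optional) – Whether to update the total image count. Defaults to True.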

advsecurenet.evaluation.evaluators.transferability_evaluator module

class advsecurenet.evaluation.evaluators.transferability_evaluator.TransferabilityEvaluator(target_models: List)

Bases: BaseEvaluator

Evaluates the transferability of adversarial examples generated by a source model to a list of target models.

Parameters: target_models (List) – List of target models to evaluate the transferability of adversarial examples to.

transferability_data

Dictionary containing the transferability data for each target model.

Type: dict

get_results() → dict

Calculates the transferability results for the streaming session and returns them.

Returns: Transferability results for each target model.
Return type: dict

reset(): Resets the evaluator for a new streaming session.

update(model: BaseModel, original_images: Tensor, true_labels: Tensor, adversarial_images: Tensor, is_targeted: bool = False, target_labels: Tensor | None = None) → None

Updates the transferability evaluator with new data.

Parameters:

model (BaseModel) – The model to evaluate transferability on.
original_images (torch.Tensor) – The original images.
true_labels (torch.Tensor) – The true labels of the original images.
adversarial_images (torch.Tensor) – The adversarial images.
is_targeted (bool, optional) – Whether the attack is targeted or not. Defaults to False.
target_labels (torch.Tensor, optional) – The target labels for targeted attacks. Defaults to None.

Returns:

None
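The idea of transferability can be sketched as follows: adversarial images crafted against one model are fed to other models, and the fraction that still fools each of them is recorded. This is an illustrative sketch only. The function `transfer_rates`, the toy models built by `make_model`, the use of a name-to-model dict, and the restriction to samples each target model initially classifies correctly are all assumptions, not the library's documented behavior.

```python
import torch


def make_model(bias: float):
    """Hypothetical toy classifier: predicts class 1 when sum(x) + bias > 0."""
    def model(x: torch.Tensor) -> torch.Tensor:
        s = x.flatten(start_dim=1).sum(dim=1, keepdim=True) + bias
        return torch.cat([-s, s], dim=1)  # logits for classes 0 and 1
    return model


def transfer_rates(target_models: dict, original_images: torch.Tensor,
                   adversarial_images: torch.Tensor,
                   true_labels: torch.Tensor) -> dict:
    """Hypothetical helper: untargeted transfer rate per target model."""
    rates = {}
    for name, model in target_models.items():
        with torch.no_grad():
            clean_preds = model(original_images).argmax(dim=1)
            adv_preds = model(adversarial_images).argmax(dim=1)
        # Assumption: only count samples this target model got right initially.
        correct = clean_preds == true_labels
        fooled = (adv_preds != true_labels) & correct
        total = int(correct.sum())
        rates[name] = int(fooled.sum()) / total if total else 0.0
    return rates


original_images = torch.tensor([1.0, -1.0]).reshape(2, 1, 1, 1)
adversarial_images = torch.tensor([-0.5, 0.5]).reshape(2, 1, 1, 1)
true_labels = torch.tensor([1, 0])
targets = {"model_a": make_model(0.0), "model_b": make_model(0.75)}
rates = transfer_rates(targets, original_images, adversarial_images, true_labels)
```

In this toy setup both adversarial images fool `model_a`, while only one of them fools `model_b`, whose decision boundary is shifted.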