Humor as a window into the bias of artificial intelligence

We analyzed the data using three analyses: (1) an equal-weights omnibus bias analysis, (2) a weighted omnibus bias analysis, and (3) feature-specific bias analyses. The two omnibus analyses allowed us to capture the overall presence and magnitude of bias across all dimensions of the AI output, while the feature-specific bias analyses allowed us to assess the presence and magnitude of bias on specific dimensions (e.g., race, gender, body weight).

The equal-weights omnibus bias analysis was computed using a preregistered coding scheme. For each dimension, we determined whether there was a shift toward the stereotype across the two images (coded as 1; e.g., White → minority), away from it (coded as −1; e.g., minority → White), or no shift (coded as 0; e.g., White → White, or minority → minority). The results of the equal-weights omnibus bias analysis revealed a moderate bias toward stereotypes in the AI-generated images (M = 0.39, SE = 0.05, t(264) = 7.25, 95% CI [0.286, 0.499], p < 0.001, d = 0.45).
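For readers who want to see the arithmetic, the following is a minimal sketch of the equal-weights omnibus computation. The file name, column names, and long-format layout are assumptions for illustration, not the paper's actual pipeline: each image pair gets a +1/−1/0 code per dimension, the codes are averaged within a pair, and the per-pair means are tested against zero with a one-sample t-test.

```python
# Minimal sketch of the equal-weights omnibus bias analysis (assumed data layout).
import pandas as pd
from scipy import stats

# Hypothetical long-format data: one row per image pair x dimension, with a
# precoded shift of +1 (toward the stereotype), -1 (away), or 0 (no change).
df = pd.read_csv("image_codes.csv")          # columns: pair_id, dimension, shift

# Equal-weights omnibus score: average the +1/-1/0 codes across dimensions
# within each image pair.
omnibus = df.groupby("pair_id")["shift"].mean()

# One-sample t-test against 0: is the average shift reliably toward stereotypes?
t, p = stats.ttest_1samp(omnibus, popmean=0)
d = omnibus.mean() / omnibus.std(ddof=1)     # Cohen's d for a one-sample design
print(f"M = {omnibus.mean():.2f}, t({len(omnibus) - 1}) = {t:.2f}, p = {p:.3g}, d = {d:.2f}")
```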

The weighted omnibus bias analysis was similar, but was computed using weights derived from a pretest (N = 300, recruited from Prolific; Mage = 39; 50% female; 67% White). The pretest asked participants how concerned they believed policymakers should be about bias toward each of the groups we studied: "In your opinion, how concerned should policymakers be about discrimination toward [group]?" (0 = "not at all", 100 = "a great deal")22. These ratings were used to compute weights for a bias measure spanning the five dimensions. For example, if participants rated concern about discrimination toward women at 80 out of 100 and toward men at 20 out of 100, the weights would be 0.8 and 0.2, respectively. A shift toward the gender stereotype (female → male) would then be coded as 0.6 (0.8 − 0.2). We acknowledge that alternative methods (e.g., prior research, expert opinion) could be used to capture the severity of the various biases; we chose the current approach to capture the concerns of the general public, because members of the public are the users of these technologies and a key stakeholder for both AI companies and policymakers. More importantly, our main results remain consistent even when these weights are not applied. Indeed, just as in the equal-weights analysis, we found a moderate bias toward stereotypes in the AI-generated images in the weighted omnibus bias analysis (M = 0.09, SE = 0.01, t(264) = 8.65, 95% CI [0.0729, 0.116], p < 0.001, d = 0.531).
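The sketch below shows one plausible reading of the weighted coding implied by the worked example above; the weight table, function, and file are hypothetical, and only the gender entry is taken from the text.

```python
# Minimal sketch of the weighted omnibus coding (assumed weights and layout).
import pandas as pd
from scipy import stats

# Hypothetical pretest-derived weights: mean concern rating (0-100) / 100 for
# the two groups on each dimension.
weights = {
    # dimension: (weight of group shifted away from, weight of group shifted toward)
    "gender": (0.8, 0.2),   # e.g., women rated 80/100, men 20/100 in the pretest
    # ... one entry per dimension, taken from the pretest ratings
}

def weighted_code(dimension: str, shift: int) -> float:
    """Convert a +1/-1/0 stereotype shift into its weighted counterpart."""
    w_from, w_to = weights[dimension]
    return shift * (w_from - w_to)   # e.g., +1 on gender -> 0.8 - 0.2 = 0.6

df = pd.read_csv("image_codes.csv")          # columns: pair_id, dimension, shift
df["weighted_shift"] = [weighted_code(d, s) for d, s in zip(df["dimension"], df["shift"])]
omnibus_w = df.groupby("pair_id")["weighted_shift"].mean()
print(stats.ttest_1samp(omnibus_w, popmean=0))
```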

To determine the presence of bias on specific dimensions, we estimated four multilevel logistic regression models. Specifically, we regressed the presence of each feature (coded as a binary variable) on the image version (original vs. funnier), including the prompt as a random intercept (Table 1). Due to complete separation in the multilevel logistic regression for the body weight dimension, we used an OLS regression with clustered robust standard errors for that dimension. We found significant differences between the two image versions (original vs. funnier) for all dimensions. The bias was strong and in the expected direction for the age (β = 2.93, SE = 0.62, z = 4.74, 95% CI [1.72, 4.14], p < 0.001), body weight (β = 0.095, SE = 0.018, t(523) = 5.19, 95% CI [0.059, 0.13], p < 0.001), and visual impairment (β = 2.83, SE = 0.42, z = 6.69, 95% CI [2.00, 3.65], p < 0.001) dimensions. However, we found the opposite effect for the race (β = −1.32, SE = 0.52, z = −2.53, 95% CI [−2.34, −0.30], p = 0.011) and gender (β = −0.88, SE = 0.38, z = −2.28, 95% CI [−1.63, −0.12], p = 0.022) dimensions (Fig. 2A). Thus, prompting the model to make an image funnier produced subjects who were older, heavier, and more visually impaired, but who were less likely to be racial minorities or women.
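A rough sketch of the clustered-OLS fallback described above is given below. The data frame, file, and column names are assumptions; the four multilevel logistic regressions themselves would be fit with a mixed-effects logistic routine (the paper does not state which software was used), so the sketch only illustrates the linear-probability fallback with prompt-clustered standard errors.

```python
# Sketch of the clustered-OLS fallback used when the multilevel logistic model
# suffers from complete separation (assumed data frame and column names).
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("image_features.csv")    # columns: prompt_id, funnier (0/1),
                                          # heavier (0/1), older (0/1), ...

# Linear probability model: presence of the feature regressed on image version,
# with standard errors clustered by the originating prompt.
ols_fit = smf.ols("heavier ~ funnier", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["prompt_id"]}
)
print(ols_fit.summary())
```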

Table 1 Regression results examining the effect of making an image funnier on the presence of each feature (race, gender, visual impairment, body weight, age).
Figure 2

The percentage of minority groups represented. (A) The percentage of minority groups (visually impaired, older, higher body weight, racial minority, and female) represented before and after making the images funnier. Error bars are standard errors of the proportion. (B) The percentage of politically sensitive (racial minority and female) and politically non-sensitive (visually impaired, higher body weight, and older) groups represented before and after making the images funnier. Error bars are standard errors of the proportion. Politically sensitive groups were less likely to be represented after making the images funnier, while politically non-sensitive groups were more likely to be represented after making the images funnier.

We speculate that this unexpected pattern may be linked to relative differences in the political sensitivity of biases across the different dimensions. To provide preliminary evidence for this argument, we created a "political sensitivity" variable based on a separate set of responses from participants in a preregistered pretest (N = 100, recruited from Prolific; Mage = 40; 55% female; 67% White; https://aspredicted.org/wr4_x6H). Specifically, we asked participants to indicate how concerned they believed companies would be about being accused of bias against different types of groups (e.g., based on race or body weight): "In your opinion, how concerned would companies be if they were accused of [group] bias?" (0 = "not at all concerned", 100 = "very concerned").

Building on previous research on bias and stereotypes23,24,25,26, we expected participants to indicate that companies are more concerned about accusations of racial and gender bias than about biases related to age, body weight, or visual impairment. We focused on participants' perceptions of how concerned companies are, rather than on their own or society's beliefs, given the assumption that companies offering generative AI are likely to respond to consumer perceptions of the company. On this account, companies may prioritize correcting politically sensitive biases (race and gender) because of the higher risk of public backlash or legal repercussions, while neglecting less sensitive biases (age, body weight, and visual impairment).

As expected, we found that bias based on race and gender (M = 80.00) was rated as significantly more concerning than bias based on age, body weight, or visual impairment (M = 61.20; β = 18.80, SE = 1.84, t(399) = 10.25, 95% CI [15.20, 22.40], p < 0.001). This effect remained significant when controlling for participants' self-reported race, gender, and age (β = 18.80, SE = 1.83, t(399) = 10.25, 95% CI [15.20, 22.40], p < 0.001). Based on these results, we then conducted an additional (not preregistered) analysis of the image data. In this analysis, the race and gender dimensions were coded as "politically sensitive", and the body weight, visual impairment, and age dimensions were coded as "politically non-sensitive". We tested the interaction between political sensitivity and our focal independent variable (making the image funnier) by estimating a multilevel logistic regression. Specifically, we regressed the presence of each feature on the image version (original vs. funnier), political sensitivity (politically sensitive vs. politically non-sensitive), and their interaction. We included the prompt as a random intercept. As expected, we found a significant interaction between political sensitivity and image version on minority group representation (β = 3.67, SE = 0.42, z = 8.79, 95% CI [2.85, 4.49], p < 0.001). While politically sensitive groups were less likely to be represented after making the image funnier (β = −1.00, SE = 0.30, z = −3.38, 95% CI [−1.58, −0.42], p < 0.001), politically non-sensitive groups were more likely to be represented after making the image funnier (β = 2.67, SE = 0.29, z = 9.07, 95% CI [2.09, 3.24], p < 0.001) (Fig. 2B).
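The sketch below illustrates how the interaction model could be specified. The file and column names are hypothetical, and the paper does not state which software was used; here the random intercept for prompt is approximated with statsmodels' variational Bayes mixed logistic model, which is one possible stand-in for a frequentist multilevel logistic regression.

```python
# Sketch of the interaction analysis: feature presence ~ image version x
# political sensitivity, with a random intercept for the originating prompt
# (assumed data layout).
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

df = pd.read_csv("image_features_long.csv")
# Expected columns (hypothetical): present (0/1), funnier (0/1),
# sensitive (0/1 = race or gender dimension), prompt_id.

model = BinomialBayesMixedGLM.from_formula(
    "present ~ funnier * sensitive",      # main effects plus their interaction
    {"prompt": "0 + C(prompt_id)"},       # random intercept for each prompt
    data=df,
)
result = model.fit_vb()                   # approximate (variational Bayes) fit
print(result.summary())
```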

Finally, to determine whether the bias arose from the language model or from the text-to-image model, we conducted an additional analysis examining the updated prompts (i.e., asking the model, "What is the prompt you used to create this image?"). We did not find evidence of bias in the language model in this additional analysis, except for visual impairment (χ2(1) = 34.04, p < 0.001): the word "glasses" appeared more often in the updated text prompts (17.74% vs. 2.26%). By process of elimination, this indicates that most of the bias in the images stems from the image model, not the language model. This pattern is consistent with observations by researchers and commentators that it is more difficult to evaluate bias in image models than in language models27,28.
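As an illustration, the prompt-text check could be run as a simple chi-square test on a 2 × 2 table of prompt version by whether "glasses" is mentioned; the file, column names, and keyword pattern below are assumptions.

```python
# Sketch of the prompt-text check: does the word "glasses" occur more often in
# the updated (funnier) prompts than in the original prompts?
import pandas as pd
from scipy.stats import chi2_contingency

prompts = pd.read_csv("prompts.csv")   # columns: version ("original"/"funnier"), text
prompts["mentions_glasses"] = prompts["text"].str.contains(r"\bglasses\b", case=False)

# 2 x 2 contingency table: prompt version x whether "glasses" is mentioned.
table = pd.crosstab(prompts["version"], prompts["mentions_glasses"])
chi2, p, dof, _ = chi2_contingency(table, correction=False)
print(table)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.3g}")
```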
