Mitigation

Goose can optionally attempt to mitigate Coordinated Omission by back-filling the metrics with the statistically expected requests. To do this, it tracks the normal "cadence" of each GooseUser, timing how long it takes to loop through all GooseTasks in the assigned GooseTaskSet. By default, Goose will trigger Coordinated Omission Mitigation if the time to loop through a GooseTaskSet takes more than twice as long as the average time of all previous loops. In this case, on the next loop through the GooseTaskSet when tracking the actual metrics for each subsequent request in all GooseTasks it will also add in statistically generated "requests" with a response_time starting at the unexpectedly long request time, then again with that response_time minus the normal "cadence", continuing to generate a metric then subtract the normal "cadence" until arriving at the expected response_time. In this way, Goose is able to estimate the actual effect of a slowdown.

When Goose detects an abnormally slow request (one in which the individual request takes longer than the normal user_cadence), it will generate an INFO level message (which will be visible if Goose was started with the -v run time flag, or written to the log if started with the -g run time flag and --goose-log is configured).

Examples

An example of a request triggering Coordinate Omission mitigation:

13:10:30 [INFO] 11.401s into goose attack: "GET http://apache/node/1557" [200] took abnormally long (1814 ms), task name: "(Anon) node page"
13:10:30 [INFO] 11.450s into goose attack: "GET http://apache/node/5016" [200] took abnormally long (1769 ms), task name: "(Anon) node page"

If the --request-log is enabled, you can get more details, in this case by looking for elapsed times matching the above messages, specifically 1,814 and 1,769 respectively:

{"coordinated_omission_elapsed":0,"elapsed":11401,"error":"","final_url":"http://apache/node/1557","method":"Get","name":"(Anon) node page","redirected":false,"response_time":1814,"status_code":200,"success":true,"update":false,"url":"http://apache/node/1557","user":2,"user_cadence":1727}
{"coordinated_omission_elapsed":0,"elapsed":11450,"error":"","final_url":"http://apache/node/5016","method":"Get","name":"(Anon) node page","redirected":false,"response_time":1769,"status_code":200,"success":true,"update":false,"url":"http://apache/node/5016","user":0,"user_cadence":1422}

In the requests file, you can see that two different user threads triggered Coordinated Omission Mitigation, specifically threads 2 and 0. Both GooseUser threads were loading the same GooseTask as due to task weighting this is the task loaded the most frequently. Both GooseUser threads loop through all GooseTasks in a similar amount of time: thread 2 takes on average 1.727 seconds, thread 0 takes on average 1.422 seconds.

Also if the --request-log is enabled, requests back-filled by Coordinated Omission Mitigation show up in the generated log file, even though they were not actually sent to the server. Normal requests not generated by Coordinated Omission Mitigation have a coordinated_omission_elapsed of 0.

Coordinated Omission Mitigation is disabled by default. It can be enabled with the --co-mitigation run time option when starting Goose. It can be configured to use the average, minimum, or maximum GooseUser cadence when backfilling statistics.