Audio transcript and materials used on 14th October 2021 for part Ib Computer Science Concurrent and Distributed Systems Examples Class 1.
Token passing mutual exclusion. ------------------------------- int g_worklock = 0; const int N_PARTICIPANTS = 4; // A thread calls this when it wants its work executed // with exclusive access to whatever g_worklock is protecting. int do_my_work(void (*workptr)()) { int my_tid = gettid(); // while(g_worklock != my_tid) { wait(); } wait(g_worklock == my_tid); if (workptr) workptr(); g_worklock = (1+g_worklock) % N_PARTICIPANTS; }
Atomic-free token passing design attempt:
Does this code work?
What assumptions do we make?
Can we improve on the spinning? (wait on arbitrary condition?)
Is token passing access control a useful idea in general?
2:00 Does it work?
4:00 Memory write consistency (advanced question)
5:00 Is it good code?
6:00 Spinning while waiting is a waste of time?
8:00 Re-evaluation of wait guard predicates.
9:00 What can a threads package usefully wait on?
10:00 [BTW: It's quick to know whether the nth prime is odd or even!]
12:00 Fairness/social behaviour by regularly calling do_my_work().
Let's look at it in a message-passing scenario:
while(1) { struct packet *rxd = packetwait(network_ifc_handle); switch (rxd->packet_type) { case worktoken: if (i_have_some_work) do_some_of_my_work(); packetsend(worktoken, (my_address+1)% N_PARTICIPANTS); break; case ...: ...dispatch other packets accordingly; break; } }
If this were in distributed system, would this be a good idea
(compared with a parallel computing platform?). What
other consideration do we need to make?
0:00 No shared variable in this phrasing.
1:00 Actor model, message passing to circulate the token.
2:00 Packet send primitive.
2:28 Any change in assumptions made?
3:00 Unbounded latency or unreliable communication.
3:42 Supercomputer interconnect is ulta low latency.
4:40 Loosing the token: parallel computing vs distributed computing.
5:00 Clarification of sendpacket primitive.
6:31 Who issues the first token?
// test0.c #includeint main() { int i; int ss = 0; for (i=0; i<10; i++) ss += i; //#ifdef CBMC __CPROVER_assert(ss > 20, "my postcondition"); __CPROVER_assert(ss > 200, "second postcondition"); //#endif printf("ss=%i\n", ss); return 0; }
Well this could be just checked with 'dynamic validation' since there are no inputs.
0:30 Run as a C program
1:30 As processed by cbmc model checker
2:11 Dynamic validation - showing code is not incorrect by running tests
2:20 Only one execution path so not much gain from formal proof!
Test 1 - A digression into model checking and not much to do with concurrency (yet).
// test1.c Now with undetermined inputs extern unsigned int reader(); int main() { int P = 2 + (reader() & 0xF/*FFFFF*/); // Add further F values to increase the complexity. int Q = 2 + (reader() & 0xF/*FFFFFF*/); int ss = P * Q; __CPROVER_assert(ss >=4, "postcondition1"); __CPROVER_assert(ss >=20, "postcondition2"); return 0; }
CMBC example output without two 16-bit inputs TXT.
How many more possible execution paths are there for each F added?
0:30 Undetermined input sourced from outside our model of interest (the program)
1:48 Proof that (2..17) * (2..17) > 4.
2:12 A bigger proof (30 bits of input)
2:30 An even bigger one that fails, why? Integer overflow.
// test2.c An example containing an unbounded? loop. int main() { int i, N = 2 + (reader()&0xF); int ss = 0; for (i=0; i<N; i++) { ss += 3*i; } // Is this loop bounded? { __CPROVER_assert(ss >=3, "postcondition"); } return 0; } // But we can see that it will have at most 17 iterations. // cbmc test2.c --unwind 100 // This course is just giving you a taste of model checking. The differences // between bounded and unbounded model checking are certainly not examinable on this course.
test3.c - demonstrating scheduller non-determinism - pthreads Demo in C.
Perhaps experiment with different levels of optimisation (-O0 or -O2) or the volatile keyword?
#include <pthread.h> #include <stdio.h> /* volatile */ int g_shared_var = 1; const int LIMIT = 1000; void *alice(void *arg) { printf("Address of Alice's stack %p\n", &arg); for (int p=0; p<LIMIT; p++) { g_shared_var += 1; } return 0; } void *bob(void *arg) { printf("Address of Bob's stack %p\n", &arg); for (int p=0; p<LIMIT; p++) { g_shared_var *= 501; } return 0; } int main() { pthread_t pt_alice, pt_bob; void *alice_result = 0; void *bob_result = 0; pthread_create(&pt_alice, NULL, alice, 0); pthread_create(&pt_bob, NULL, bob, 0); pthread_join(pt_alice, &alice_result); pthread_join(pt_bob, &bob_result); printf("g_shared_var=%i\n", g_shared_var); return 0; // eof
What sources of non-determinacy in test3 are there?
0:45 Viewing the address of local variables on the stack.
1:16 pthread_create and join used to orchestrate two child threads.
1:14 What am I trying to demonstrate?
2:56 [ Response about a lack of cache consistency deleted.]
3:20 A lack of cache consistency will not be demonstrated since my laptop works.
4:30 Rant about cache consistency not being green!
5:18 Examination of the output from different runs.
5:30 We see that stack layout varies from run-to-run.
5:50 Code, stack, heap and static data segments discussion.
7:10 We see some repeated answers. Few different ones.
8:08 cat /proc/cpuinfo - how did the O/S map the threads to cores?
9:01 We have unique answers from longer runs.
9:59 Does a program that takes a long time to run take a long time to model check.
Some example output, showing bimodal output in this case
gcc -lpthread test3.c;./a.out;./a.out;./a.out;./a.out;./a.out;./a.out;./a.out Address of Alice's stack 0x7f5576fe3ed8 Address of Bob's stack 0x7f5576652ed8 g_shared_var=-697947867 Address of Alice's stack 0x7f15055d9ed8 Address of Bob's stack 0x7f1504c48ed8 g_shared_var=1621556425 Address of Alice's stack 0x7fe98ea73ed8 Address of Bob's stack 0x7fe98e0e2ed8 g_shared_var=1621556425 Address of Alice's stack 0x7f7cc77f8ed8 Address of Bob's stack 0x7f7cc6e67ed8 g_shared_var=1621556425 Address of Alice's stack 0x7f3c24e26ed8 Address of Bob's stack 0x7f3c24495ed8 g_shared_var=1621556425 Address of Alice's stack 0x7fc983698ed8 Address of Bob's stack 0x7fc982d07ed8 g_shared_var=1621556425 Address of Alice's stack 0x7fc14c12bed8 Address of Bob's stack 0x7fc14b79aed8 g_shared_var=1621556425
// test4.c #include <pthread.h> #include <stdio.h> int g_ss = 0; void *alice(void *arg) { g_ss = 6; return (void *)3; } void *bob(void *arg) { g_ss = 1; return (void *)2; } int main() { pthread_t pt_alice, pt_bob; void *alice_result = 0; void *bob_result = 0; pthread_create(&pt_alice, NULL, alice, 0); pthread_create(&pt_bob, NULL, bob, 0); pthread_join(pt_alice, &alice_result); pthread_join(pt_bob, &bob_result); int rtot = (long int) alice_result + (long int)bob_result; printf("rtot=%i g_ss=%i\n", rtot, g_ss); #ifdef CBMC __CPROVER_assert(g_ss == 1, "postcondition ss"); __CPROVER_assert(rtot == 5, "postcondition tot"); #endif return 0; }
test4.c : There was not time to discuss this final example in the class.