If you were eating soup from a bowl holding 500 ml, taking 25 ml spoonfuls, and rain replaced the volume you ate at the same rate as you ate it, how many spoonfuls would it take for the soup to be completely replaced with water? Also, when that happens, would it still be the same soup?


Ok, I couldn’t resist. Here are the results of running that simulation 6000 times.
Min.   1st Qu.   Median   Mean   3rd Qu.   Max.   NA's
1085   1126      1140     1144   1157      1292   0

This means that about half the time it’s somewhere between 1126 and 1157 spoons.
FYI: /u/protist@mander.xyz, /u/Sabin10@lemmy.world, /u/neo2478@sh.itjust.works
You rock! Thank you :)
If I find myself in the right mood I might try to work out the actual distribution. If I do, your simulation will be a very handy sanity check!
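Not the actual distribution, but a rough back-of-the-envelope check on the mean is possible. The sketch below is a heuristic under loud assumptions: it treats every soup molecule as surviving each spoonful independently with probability 1 - 25/500, so the answer is the maximum of many geometric lifetimes, and it uses the standard large-sample approximation for that maximum. The molecule count is the figure quoted further down the thread; the function name is made up for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def approx_mean_spoonfuls(molecules, spoon_fraction):
    """Heuristic mean of the max of `molecules` independent geometric lifetimes.

    Assumes each molecule is removed by any given spoonful with probability
    `spoon_fraction`, independently of the others (an approximation).
    """
    rate = -math.log(1.0 - spoon_fraction)
    return (math.log(molecules) + EULER_GAMMA) / rate + 0.5

estimate = approx_mean_spoonfuls(1.671398e25, 25 / 500)
print(round(estimate, 1))
```

If the assumptions are reasonable, the printed value should land in the same ballpark as the simulated mean of 1144.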
Thanks!
If you want, I can increase the sample size. Just need to figure out how to add a timer to the code and set it to run for a few hours, maybe even overnight. A histogram with 10^6 bins should look pretty smooth.
Should also upgrade the visualization to ggplot2. Frivolous computations like this deserve no less.
By the way, how did you actually simulate it? Surely you didn’t keep 10^25 variables in memory…
I thought of making a vector with a length of about 1.671398e+25, but then I remembered what happened that one time when I tried to make a linear model with hundreds of dimensions. So yeah… We have gigabytes of RAM, and it’s still not enough. Not really a problem, as long as you don’t try to do anything completely ridiculous.
Instead, I just made a variable that simply contains the number of soup molecules and another one for the number of water molecules. Far simpler that way.
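For anyone wondering where a number like 1.671398e+25 comes from: it matches the count of water molecules in 500 ml, if you treat the soup as pure water. A minimal sketch of that arithmetic, where the density and molar mass are my assumptions rather than values stated in the thread:

```python
AVOGADRO = 6.02214076e23      # molecules per mole (exact SI definition)
MOLAR_MASS_WATER = 18.01528   # g/mol, assumed value for H2O
DENSITY = 1.0                 # g/ml, assuming the soup behaves like water

def molecules_in(volume_ml):
    """Approximate number of water molecules in the given volume."""
    return volume_ml * DENSITY / MOLAR_MASS_WATER * AVOGADRO

bowl_molecules = molecules_in(500)   # the whole bowl
spoon_molecules = molecules_in(25)   # one spoonful

print(f"bowl:  {bowl_molecules:.6e}")
print(f"spoon: {spoon_molecules:.6e}")
```

Two floating-point counters like these are all the state the simulation needs, which is why the vector is unnecessary.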
Here’s where the magic happens:
    # Number of soup molecules drawn
    soup_molecules_replaced <- rbinom(1, replacement_count, prob_soup)

The rbinom function generates random numbers from a binomial distribution, a discrete probability distribution that models the number of successes in a fixed number of independent trials. Here, a success means scooping out a soup molecule. The rest of the code is just basic infrastructure like variables, loops, etc.
BTW the variable names look ugly, because I couldn’t be bothered to tidy everything up. I really prefer camelCase, whereas Mistral seems to prefer underscores. That’s what you get for vibing.
Side note: If you do this kind of stuff for private purposes, you have to rely on your own hardware. If you plan to publish your discoveries, universities and publicly funded supercomputers might be an option. If there exists a Journal Of Recreational Mathematics And Useless Simulations (JORMAUS), I could totally publish this stuff and maybe even run my code on a supercomputer.
Interesting. I don’t know why I didn’t think of just keeping a count of soup molecules. Must have been late!
Another interesting point: your simulation is subtly wrong in a different way from my calculation. When there is only one soup molecule left, there is a chance (however tiny) that rbinom will return 2 or more, taking out more soup molecules than there really are. If you run it enough times with a bowl of 3 molecules and a spoon of 2 molecules, I’m sure you’ll hit -1 soup molecules some of the time.
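The toy case can be checked exactly rather than by waiting for it to happen. The sketch below computes the probability that a binomial draw asks for more soup molecules than exist; the function names are hypothetical, and it assumes the draw uses p = soup / (soup + water), as the rbinom call appears to.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def overdraw_probability(soup, water, spoon):
    """Chance that a binomial draw of `spoon` molecules exceeds `soup`."""
    p = soup / (soup + water)
    # Sum the probabilities of every impossible outcome: soup+1, ..., spoon.
    return sum(binomial_pmf(k, spoon, p) for k in range(soup + 1, spoon + 1))

# Bowl of 3 molecules with 1 soup molecule left, spoon of 2 molecules.
print(overdraw_probability(soup=1, water=2, spoon=2))
```

For this bowl the overdraw chance is (1/3)^2 = 1/9 per spoonful, so hitting -1 soup molecules should not take long.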
For a simulation I think we can do better. There must be a random function that does it properly. The function we want is like pulling balls of 2 colors out of a sack without replacement. Pretty common combinatorics question, I would expect a random function to match.
You’re right. I just ran rbinom 1E7 times and found that the probability of overdrawing soup molecules is a bit too high for my taste.
When there’s only 1 left, you usually end up drawing 0 or 1 molecule. However, in rare cases, it can be higher, such as 2, 3, 4… molecules.
About 92% was 0, and 7.7% was 1, but the others were not negligible! There’s about a 0.3% probability of overdrawing, which is way too high for a simulation as serious as this one. In this quick test, there were 20 incidents where rbinom wanted to pull out 4 soup molecules when only 1 was available. We can’t have that, now can we!
In Python, the closest I could find was (untested):

    sum(random.sample([1, 0], spoon_size, counts=[soup_count, water_count]))
But this would create an intermediate list of length spoon_size which is not a good idea.
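One way around the intermediate list is to pull molecules out one at a time and only keep running counts. This is a sketch, not a recommendation for the real problem: it uses constant memory but still loops spoon_size times, so it only helps at toy scale. The function name and the rng parameter are made up for illustration.

```python
import random

def draw_soup(soup_count, water_count, spoon_size, rng=random):
    """Count soup molecules in one spoonful, drawn without replacement."""
    total_left = soup_count + water_count
    if spoon_size > total_left:
        raise ValueError("spoon is bigger than the bowl")
    soup_left = soup_count
    drawn = 0
    for _ in range(spoon_size):
        # Pick one of the remaining molecules uniformly at random.
        if rng.randrange(total_left) < soup_left:
            drawn += 1
            soup_left -= 1
        total_left -= 1
    return drawn

rng = random.Random(0)
print([draw_soup(1, 2, 2, rng) for _ in range(10)])
```

Because each pick updates the remaining counts, the result can never exceed the soup that is actually in the bowl.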
/u/TranquilTurbulence@lemmy.zip I’m fairly sure the distribution you should use is the hypergeometric distribution, which I found via the urn problem.
Hmmm… The description certainly fits. Just by eyeballing the graphs, they look very different from what I got, but I guess that’s just the expected result of running rbinom about 6 million times. With a smaller simulation, it might not have been so apparent. Also, that’s what you get for skipping the maths and vibing the code without thinking too much about the details. Well, at least I got this far with absolutely minimal effort. :D
It appears that I need to switch to a better distribution. Thanks for looking into this mystery!
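For comparison, here is a toy-scale sketch of the whole experiment with draws made without replacement, so the soup count can never go negative. It reuses the random.sample idea from earlier in the thread (the counts argument needs Python 3.9 or newer). The bowl and spoon sizes are arbitrary small numbers chosen to keep the 1:20 ratio, not the real molecule counts, and the function names are hypothetical.

```python
import random

def draw_spoonful(soup, water, spoon_size, rng):
    """Soup molecules in one spoonful, sampled without replacement."""
    return sum(rng.sample([1, 0], spoon_size, counts=[soup, water]))

def spoonfuls_until_no_soup(bowl_size, spoon_size, rng):
    """Eat until no soup molecules remain; rain refills the bowl each time."""
    soup, water, spoonfuls = bowl_size, 0, 0
    while soup > 0:
        drawn = draw_spoonful(soup, water, spoon_size, rng)
        soup -= drawn    # eaten soup molecules leave the bowl
        water += drawn   # rain tops the bowl back up with water
        spoonfuls += 1
    return spoonfuls

rng = random.Random(42)
results = [spoonfuls_until_no_soup(1000, 50, rng) for _ in range(100)]
print(min(results), sum(results) / len(results), max(results))
```

At the real scale of roughly 10^25 molecules this approach is far too slow, so a proper hypergeometric sampler would be needed there.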