d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
November 02, 2024, 07:44:24 PM Last edit: November 08, 2024, 06:45:04 AM by d5000
Do you want to know how likely it is for Bitcoin to reach 100,000 USD before 2025, or 200,000 before 2026? I have written a Python script which lets you calculate the probability based on previous price movements. For example, you can take a timeframe from the past where you suspect that the market was in a situation similar to the current one. Many think, for example, that we're in a mid-bull market, so interesting years would be 2016/17 or 2020/21.

How does the script work? It takes the daily percentage price changes in the timeframe you select and assigns each of them the same probability. For example, if you use 2016/17, then it takes the daily changes from 2016-01-01 to 2016-01-02 up to 2017-12-30 to 2017-12-31, i.e. 730 days (2 x 365). Each of these changes has a probability of exactly 1/730. You now tell the script how many days you want to simulate. For example, until January 1, 2025 we have 60 days (as of November 2). And you also tell it the current Bitcoin price, e.g. $69,500. The script now applies one of the price changes from the data series at random, and repeats that as many times as the number of days you specified. If the target price is reached, the script increases a counter. At the end it tells you in how many simulations the target price was reached, compared to the total number of simulations.

The idea came from this thread; unfortunately it was closed while the discussion about the script was still ongoing, so I'm opening a thread exclusively for this script. (Interesting posts in the linked thread are this one, this one and above all this one by user @DirtyKeyboard, which includes an alternative script that works very similarly.) Interesting input also came from @Tubartuluk in the German subforum; the takeaway is that it's better to use a longer series of data.

This is of course not a 100% scientific approach. It's basically a simplified Monte Carlo simulation. It should be done just for fun, and not for investment decisions!
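To make the mechanism concrete, here is a minimal sketch of the core idea with made-up change factors (the real script derives them from the CSV price series, e.g. 730 equally weighted changes for 2016/17):

Code:
import random

# Toy example: daily change factors resampled with equal probability.
daily_changes = [1.02, 0.99, 1.05, 0.97, 1.01]  # made-up values

def reaches_target(start_price, target, days):
    # One Monte Carlo path: draw one historical daily change per simulated
    # day and report whether the target is hit at any point along the way.
    price = start_price
    for _ in range(days):
        price *= random.choice(daily_changes)
        if price >= target:
            return True
    return False

# Fraction of paths that hit the target approximates the probability.
hits = sum(reaches_target(69500, 100000, 60) for _ in range(10000))
print("Probability: {:.1f} %".format(hits / 10000 * 100))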
Here's the script. Save it as price-probabilities.py together with a CSV file with the price series. DirtyKeyboard has provided this CSV with Bitstamp data since 2012 in the other thread. If you don't know Python, simply save the data as "prices.csv" in the same folder as the script, and it will work as long as the first column in the CSV file is called "Price" and the second "Date", containing the price and date. Call the script with (Python 3.8+ required):

python price-probabilities.py STARTPRICE TARGET_PRICE DAYS [SIMULATIONS]

- STARTPRICE: the current price, or another price you want to start the simulation at.
- TARGET_PRICE: the price you want to know the probability of reaching (e.g. 100000).
- DAYS: number of days to simulate. For example, if you want to know the probability within the next year, use 365.
- SIMULATIONS: optional, the number of simulations (default: 10000; the higher the better).

In the script you have to specify the date range for the base series. Currently it includes the years 2020 and 2021, i.e. the latter two years of the previous bull market. You change this with the STARTDATE and ENDDATE variables in the script itself (I didn't want to add too many command line parameters).

Edit: An expanded script including graph and histogram is available in this post.

Code:
import sys, random, csv, datetime
from decimal import Decimal

# usage: python price_probabilities.py STARTPRICE ENDPRICE DAYS [SIMULATIONS]
# STARTPRICE is the price you use as a base, e.g. the current BTC/USD price
# ENDPRICE is the price you want to know the probability of being reached

CSV = True  # set to False to use the example series below
CSVFILE = "prices.csv"  # replace with the file name of your CSV data
STARTDATE = "2020-01-01"  # start date of the series
ENDDATE = "2021-12-31"  # end date of the series
DEBUG = False

# example series from Sept 22 to Oct 23, 2024 (daily percent changes)
SEP22OCT23 = [0.92, -1.13, 0.08, -2.40, 0.96, -0.07, 1.49, -0.30, 0.81, 1.51,
              5.11, -0.53, 1.13, 3.62, -0.52, -2.46, -0.14, -0.91, 1.22, -0.03,
              2.19, 0.17, -0.31, -3.95, -3.46, -0.39, 0.14, 0.92, 3.20, -1.71,
              1.46, -0.38]

def csv_to_series(csvfilename, startdatestr, enddatestr, debug=False):
    """Read daily prices from the CSV and return day-to-day change factors."""
    startdate = datetime.date.fromisoformat(startdatestr)
    enddate = datetime.date.fromisoformat(enddatestr)
    with open(csvfilename, "r") as csvfile:
        reader = csv.DictReader(csvfile)
        prices = [r for r in reader]
    change_series = []
    prev_price = None
    for day_data in prices:
        date = datetime.date.fromisoformat(day_data["Date"])
        if date < startdate:
            continue
        if date > enddate:
            break
        price = Decimal(str(day_data["Price"]))
        if prev_price is None:
            if debug:
                print(price)
            prev_price = price
            continue
        change = price / prev_price
        change_series.append(change)
        prev_price = price
        if debug:
            print(price, change)
    if debug:
        print(change_series)
    return change_series

# price probability
start_price = Decimal(str(sys.argv[1]))
end_price = Decimal(str(sys.argv[2]))
days = int(sys.argv[3])
if len(sys.argv) > 4:
    simulations = int(sys.argv[4])
else:
    simulations = 10000

base_price_series = SEP22OCT23

if CSV:
    price_series = csv_to_series(CSVFILE, STARTDATE, ENDDATE, DEBUG)
else:
    # convert percent changes to change factors, e.g. 0.92 -> 1.0092
    price_series = [Decimal(str(100 + i)) / 100 for i in base_price_series]

passed_sims = 0
for sim in range(simulations):
    price = start_price
    sim_days = []
    for day in range(days):
        price_change = random.choice(price_series)
        price = price * price_change
        sim_days.append(price)
    if DEBUG:
        print("Simulation {}: {}".format(sim, sim_days))
    for dprice in sim_days:
        if dprice >= end_price:
            passed_sims += 1
            break

probability = Decimal(passed_sims * 100) / simulations

print("Price: {} in {} simulations".format(end_price, simulations))
print("Reached in {} simulations".format(passed_sims))
print("Probability: {} %".format(float(probability)))
(Edited: removed unnecessary print statements from the script)
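An example run for the scenario mentioned above could look like this (the probability value is purely illustrative; results vary between runs):

Code:
$ python price-probabilities.py 69500 100000 60
Price: 100000 in 10000 simulations
Reached in 3120 simulations
Probability: 31.2 %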
DirtyKeyboard
Nice. I'm working on a little script to keep that pastebin file updated, and I added some graphing. I figured I would run till the end of this year, using the end of last year's data.

Code:
Enter start date (YYYY-MM-DD): 2023-11-03
Enter end date (YYYY-MM-DD): 2023-12-31
Number of days to simulate: 58
Starting price: 68442
Number of simulations: 100

Interesting spread. It ends in profit, but that bottom blue series doesn't look like fun.
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
November 07, 2024, 05:25:59 AM
I've tried your script with the graph plotting now, and it works awesome. I've adapted it a bit so it can be used with command line parameters, like my original script. Is it okay for you if I publish it here? (Otherwise I'll adapt my own script, adding the CSV saving and plotting parts.)

Now that we've already seen a new all-time high, I fed the script a $75,000 starting price. Some scenarios based on the whole last bull market (2019-01-01 to 2021-12-31) [1]:

- 35% for $100,000 until the end of the year
- 60% for $100,000 in 100 days
- 93% for $100,000 in 365 days
- 53% for $200,000 in 365 days

Unfortunately talkimg.com doesn't work for me at the moment, so I couldn't upload the pics.

I've also modified the script so it can also calculate the probability of crashes (i.e. the price falling below a certain threshold). This could be used to do some hedging. For an example run, I used the same data and calculated the probability of a crash to $50,000 until the end of the year. The probability was only 3.1%, which looks fine. Even $60,000 is not really likely, at 16%.

[1] I think the timeframe you chose in your last post is too short, as it is only a single small leg of a bull market, and an extremely bullish one at that. As we don't know in which stage of the bull market we are, I chose the entire last bull market, including the crashes near the start and near the end.
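For reference, the crash modification boils down to making the pass condition two-sided. Here is a small self-contained wrapper around the exact check that appears in the expanded script further down:

Code:
from decimal import Decimal

def crossed(start_price, end_price, sim_days):
    """True if the simulated path crosses end_price: upwards when the target
    is above the start price, downwards (crash case) when it is at or below."""
    for dprice in sim_days:
        if ((end_price > start_price and dprice >= end_price)
                or (start_price >= end_price and dprice <= end_price)):
            return True
    return False

# Crash example: did a path starting at 75000 dip to 50000 at any point?
print(crossed(Decimal(75000), Decimal(50000), [Decimal(72000), Decimal(49500)]))  # True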
DirtyKeyboard
November 07, 2024, 05:57:12 AM Last edit: November 07, 2024, 06:25:39 AM by DirtyKeyboard
I know, I know, you're all saying, "Why didn't you use the end of the year from 4 years ago?" Good question. I think. Anyway, this is all for fun speculation.

Code:
Enter start date (YYYY-MM-DD): 2020-11-07
Enter end date (YYYY-MM-DD): 2020-12-31
Number of days to simulate: 54
Starting price: 74057
Number of simulations: 10000
Number of simulations reaching $100,000 or more: 9592
Probability of reaching $100,000 or more: 95.92%

Wow! Who got in early? Like, as in yesterday early?

Here is the AI teamwork makes the LLM dream work script. I'm more of a terminal input person, but mix and match as you like. I added a histogram too. I think the auto format would work with less spread in the results. With the small time frame, I was just trying to have fun matching the days 1:1.

Code:
from datetime import datetime, timedelta
import csv, random
import matplotlib.dates as mdates
import pandas as pd
import matplotlib.pyplot as plt

# Get user inputs for date range
start_date = input("Enter start date (YYYY-MM-DD): ")
end_date = input("Enter end date (YYYY-MM-DD): ")

# Read the CSV file into a DataFrame
df = pd.read_csv("G:/Other computers/My Laptop/PyProjects/Top20/Top20_Total_03_24/VWAP_USD/result_with_timestamp.csv")

# Filter the DataFrame based on the user-defined date range
df = df[(df['Column 2 Name'] >= start_date) & (df['Column 2 Name'] <= end_date)]

# Calculate the percent change between days
df['Percent_Change'] = df['Column 1 Name'].pct_change() * 100

# Create a list of percent changes, dropping the first NaN value
percent_changes = df['Percent_Change'].dropna().tolist()

# Print the first few percent changes to verify
print(percent_changes[:5])

# Get user inputs
days = int(input("Number of days to simulate: "))
starting_price = float(input("Starting price: "))
number_of_simulations = int(input("Number of simulations: "))
wins = 0

# Initialize a list to store daily prices for averaging
daily_prices = [[] for _ in range(days)]

# Open the CSV file for writing results
with open('C:/PyProjects/Predictor/simulation_results.csv', mode='w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['Simulation', 'Date', 'Price'])  # Write header
    for sim in range(number_of_simulations):
        # Initialize the price list with the starting price for each simulation
        prices = [starting_price]
        reached_100k = False

        # Initialize the starting date
        current_date = datetime.now()  # or set to a specific start date
        writer.writerow([sim + 1, current_date.strftime('%Y-%m-%d'), starting_price])

        # Simulate price changes
        for day in range(days):
            percent_change = random.choice(percent_changes)
            new_price = prices[-1] * (1 + percent_change / 100)
            prices.append(new_price)

            current_date += timedelta(days=1)
            # Store the simulation number, date, and new price
            writer.writerow([sim + 1, current_date.strftime('%Y-%m-%d'), new_price])

            # Store the new price for averaging
            daily_prices[day].append(new_price)

            # Check if price has exceeded 100,000 at any point
            if new_price > 100_000 and not reached_100k:
                reached_100k = True
                wins += 1

        # Print the results for the simulation
        print(f"\nSimulation {sim + 1}:")
        print(f"Final price after {days} days: ${prices[-1]:.2f}")
        print(f"Total change: {((prices[-1] - starting_price) / starting_price * 100):.2f}%")
        print(f"Reached $100,000: {'Yes' if reached_100k else 'No'}")

print(f"\nNumber of simulations reaching $100,000 or more: {wins}")
print(f"Probability of reaching $100,000 or more: {(wins / number_of_simulations) * 100:.2f}%")

# Create a figure and axis for the plot
plt.figure(figsize=(12, 6))
results_df = pd.read_csv('C:/PyProjects/Predictor/simulation_results.csv')
results_df.columns = results_df.columns.str.strip()  # Remove any leading/trailing whitespace
results_df['Date'] = pd.to_datetime(results_df['Date'])

# Loop through each unique simulation number
for sim in results_df['Simulation'].unique():
    sim_data = results_df[results_df['Simulation'] == sim]
    plt.plot(sim_data['Date'], sim_data['Price'], marker='o')

# Add labels and title
plt.xlabel('Date')
plt.ylabel('Price')
plt.title('Price vs Date for Each Simulation')
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
plt.xticks(rotation=45)  # Rotate x-axis labels for better readability
plt.grid()  # Add grid for better readability
plt.tight_layout()
plt.gca().set_xlim(left=results_df['Date'].min(), right=results_df['Date'].max())
plt.subplots_adjust(left=0.1, right=0.9, top=0.9, bottom=0.15)

# Mirror the y-axis on the right
ax = plt.gca()
ax2 = ax.twinx()
ax2.set_ylabel('Price')
ax2.yaxis.set_label_position('right')
ax2.yaxis.tick_right()
ax2.set_ylim(ax.get_ylim())

# Show the plot
plt.show()

# Uncomment 2 lines below and replace with actual path
# file_name = f"{current_date:%m}-{current_date:%d}-{current_date:%H}-{current_date:%M}-{current_date:%S}.png"
# plt.savefig(f'C:/PyProjects/Predictor/{file_name}')

# Create a figure for the histogram
plt.figure(figsize=(12, 6))
results_df = pd.read_csv('C:/PyProjects/Predictor/simulation_results.csv')

# Extract the last price for each simulation
last_prices = results_df.groupby('Simulation')['Price'].last()

# Create bins for the price ranges
bins = range(int(last_prices.min()), int(last_prices.max()) + 1000, 1000)

# Create a histogram of the last prices
counts, _, patches = plt.hist(last_prices, bins=bins, edgecolor='black')

# Calculate min, max, and mode
min_price = last_prices.min()
max_price = last_prices.max()
mode_price = last_prices.mode()[0]  # Get the first mode if there are multiple

# Label the columns with min and max prices
plt.text(min_price, counts.max() * 0.9, f'Min: {min_price:.0f}', color='blue', fontsize=13)
plt.text(max_price - len(bins)*150, counts.max() * 0.9, f'Max: {max_price:.0f}', color='green', fontsize=13)
# plt.text(mode_price, counts.max() * 0.8, f'Mode: {mode_price}', color='green', fontsize=10)

# Set labels and title
plt.xlabel('Price on Last Day of Each Simulation')
plt.ylabel('Number of Simulations')
plt.title('Histogram of Simulations by Price on Last Day')

# Show the plot
plt.show()

# Uncomment 2 lines below and replace with actual path
# file_name = f"{current_date:%m}-{current_date:%d}-{current_date:%H}-{current_date:%M}-{current_date:%S}_2C.png"
# plt.savefig(f'C:/PyProjects/Predictor/{file_name}')

EDITED: Thanks, and if the pictures loaded, I heard imgBB is working. I might have also fixed the auto scaling.
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
Thanks for the histogram function, it looks awesome. I have now expanded the script with some command line options, the graph and histogram functions and the ability to save the results to CSV. The CSV format of my script is, however, simpler than the one in yours: it stores each simulation as a separate row, which looks more spreadsheet-friendly to me if someone wants to import the data into Excel, Calc etc.

Two known little problems, which I hope to solve soon:
- the histogram function creates two figures instead of one
- if the graph is shown, then it is not stored correctly (it seems the plot is blanked out), so I added a "--show" option; currently you can only either store the graph or show it.

See the Usage section in the script for the command line options.
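Before the full script, a quick illustration of the two --store layouts (the numbers are made up). The default format writes one simulation per row:

Code:
Day 1,Day 2,Day 3
75210.5,76100.2,74950.8
74800.1,75500.0,77020.3

while --ext uses the long format from DirtyKeyboard's script, one day per row:

Code:
Simulation,Day,Price
1,1,75210.5
1,2,76100.2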
Code:
import sys, random, csv, datetime
from decimal import Decimal
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

"""Usage:

python price_probabilities.py STARTPRICE TARGET_PRICE DAYS [-n SIMULATIONS] [-s STARTDATE_SERIES] [-e ENDDATE_SERIES] [--debug] [--store] [--ext] [--graph] [--hist] [--show]

If the STARTPRICE is below the TARGET_PRICE, then it calculates the probability
to surpass that price. If the STARTPRICE is instead equal to or above the
TARGET_PRICE, then it calculates the probability to go below that price
(useful to roughly estimate a crash risk)."""

# default values
CSV = True
CSVFILE = "prices.csv"
STARTDATE = "2019-01-01"
ENDDATE = "2021-12-31"
SIMULATIONS = 10000
DEBUG = False

# series from Sept 22 to Oct 23, 2024
SEP22OCT23 = [0.92, -1.13, 0.08, -2.40, 0.96, -0.07, 1.49, -0.30, 0.81, 1.51,
              5.11, -0.53, 1.13, 3.62, -0.52, -2.46, -0.14, -0.91, 1.22, -0.03,
              2.19, 0.17, -0.31, -3.95, -3.46, -0.39, 0.14, 0.92, 3.20, -1.71,
              1.46, -0.38]

def csv_to_series(csvfilename, startdatestr, enddatestr, debug=False):
    startdate = datetime.date.fromisoformat(startdatestr)
    enddate = datetime.date.fromisoformat(enddatestr)
    with open(csvfilename, "r") as csvfile:
        reader = csv.DictReader(csvfile)
        prices = [r for r in reader]
    change_series = []
    prev_price = None
    for day_data in prices:
        date = datetime.date.fromisoformat(day_data["Date"])
        if date < startdate:
            continue
        if date > enddate:
            break
        price = Decimal(str(day_data["Price"]))
        if prev_price is None:
            if debug:
                print(price)
            prev_price = price
            continue
        change = price / prev_price
        change_series.append(change)
        prev_price = price
        if debug:
            print(price, change)
    if debug:
        print(change_series)
    return change_series

def store_to_csv(data, csvfilename, extended=False):
    with open(csvfilename, "w", newline="") as csvfile:
        writer = csv.writer(csvfile)
        if extended:
            writer.writerow(['Simulation', 'Day', 'Price'])
            for sim, row in enumerate(data):
                for day, result in enumerate(row):
                    writer.writerow([sim + 1, day + 1, round(float(result), 2)])
        else:
            writer.writerow(["Day {}".format(d + 1) for d in range(days)])
            for row in data:
                writer.writerow([round(float(day), 2) for day in row])

def create_dataframe(data):
    startprice_sim = round(float(start_price), 2)
    data_list = []
    for sim, sim_prices in enumerate(data):
        data_list.append({"Simulation": sim, "Date": startdate_sim, "Price": startprice_sim})
        for day, price in enumerate(sim_prices):
            date = startdate_sim + datetime.timedelta(days=day + 1)
            data_list.append({"Simulation": sim, "Date": date, "Price": price})
    return pd.DataFrame(data_list)

def store_to_graph(data, graphfilename, show=False):
    # Create a figure and axis for the plot
    plt.figure(figsize=(12, 6))
    dataframe = create_dataframe(data)

    for sim in dataframe["Simulation"].unique():
        sim_data = dataframe[dataframe['Simulation'] == sim]
        plt.plot(sim_data['Date'], sim_data['Price'], marker='o')

    max_date, min_date = dataframe["Date"].max(), dataframe["Date"].min()

    # Add labels and title
    plt.xlabel('Date')
    plt.ylabel('Price')
    plt.title('Price vs Date for Each Simulation')
    plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
    plt.xticks(rotation=45)  # Rotate x-axis labels for better readability
    plt.grid()  # Add grid for better readability
    plt.tight_layout()
    plt.gca().set_xlim(left=min_date, right=max_date)
    plt.subplots_adjust(left=0.1, right=0.9, top=0.9, bottom=0.15)

    # Mirror the y-axis on the right
    ax = plt.gca()
    ax2 = ax.twinx()
    ax2.set_ylabel('Price')
    ax2.yaxis.set_label_position('right')
    ax2.yaxis.tick_right()
    ax2.set_ylim(ax.get_ylim())

    # Export the plot
    if show:
        plt.show()
    else:
        plt.savefig(graphfilename)

def store_to_histogram(data, graphfilename, show=False):
    # Create a figure and axis for the plot
    plt.figure(figsize=(12, 6))
    dataframe = create_dataframe(data)

    # Extract the last price for each simulation
    last_prices = dataframe.groupby('Simulation')['Price'].last()

    # Create bins for the price ranges
    bins = range(int(last_prices.min()), int(last_prices.max()) + 1000, 1000)

    # Create a histogram of the last prices
    plt.figure(figsize=(12, 6))  # (known issue: this opens a second, empty figure)
    counts, _, patches = plt.hist(last_prices, bins=bins, edgecolor='black')

    # Calculate min, max, and mode
    min_price = last_prices.min()
    max_price = last_prices.max()
    mode_price = last_prices.mode()[0]  # Get the first mode if there are multiple

    # Label the columns with min and max prices
    plt.text(min_price, counts.max() * 0.9, f'Min: {min_price:.0f}', color='blue', fontsize=13)
    plt.text(max_price - len(bins)*150, counts.max() * 0.9, f'Max: {max_price:.0f}', color='green', fontsize=13)
    # plt.text(mode_price, counts.max() * 0.8, f'Mode: {mode_price}', color='green', fontsize=10)

    # Set labels and title
    plt.xlabel('Price on Last Day of Each Simulation')
    plt.ylabel('Number of Simulations')
    plt.title('Histogram of Simulations by Price on Last Day')

    # Show the plot
    if show:
        plt.show()
    else:
        plt.savefig(graphfilename)

def readarg(flag, default=None):
    if flag in sys.argv:
        index = sys.argv.index(flag)
        if type(default) == bool:
            return True
        if len(sys.argv) > index + 1:
            result = sys.argv[index + 1]
            if type(default) == int:
                return int(result)
            else:
                return result
    return default

# main script
start_price = Decimal(str(sys.argv[1]))
end_price = Decimal(str(sys.argv[2]))
days = int(sys.argv[3])
simulations = readarg("-n", SIMULATIONS)
start_date = readarg("-s", STARTDATE)
end_date = readarg("-e", ENDDATE)
debug = readarg("--debug", DEBUG)
store = readarg("--store", False)
extended = readarg("--ext", False)
graph = readarg("--graph", False)
histogram = readarg("--hist", False)
show = readarg("--show", False)  # Shows the graph instead of storing it. Both don't seem to work :(
startdate_sim_str = readarg("-d", None)  # Alternative start date instead of today, mostly for the graphs.
sd = datetime.date.fromisoformat(startdate_sim_str) if startdate_sim_str else datetime.date.today()
startdate_sim = datetime.datetime(sd.year, sd.month, sd.day)

base_price_series = SEP22OCT23

if CSV:
    price_series = csv_to_series(CSVFILE, start_date, end_date, debug)
else:
    price_series = [Decimal(str(100 + i)) / 100 for i in base_price_series]

passed_sims = 0
all_sims = []

# Note that this loop does not add the start price to the sims.
for sim in range(simulations):
    price = start_price
    sim_days = []
    for day in range(days):
        price_change = random.choice(price_series)
        price = price * price_change
        sim_days.append(price)
    if store or graph or histogram:
        all_sims.append(sim_days)
    if debug:
        print("Simulation {}: {}".format(sim, sim_days))
    for dprice in sim_days:
        if ((end_price > start_price and dprice >= end_price)
                or (start_price >= end_price and dprice <= end_price)):
            passed_sims += 1
            break

probability = Decimal(passed_sims * 100) / simulations

print("Price: {} in {} simulations".format(end_price, simulations))
print("Reached in {} simulations".format(passed_sims))
print("Probability: {} %".format(float(probability)))

if store:
    csv_filename = "{}-{}-{}-{}-{}-{}sims.csv".format(start_price, end_price, days, start_date, end_date, simulations)
    store_to_csv(all_sims, csv_filename, extended=extended)

if graph:
    graph_filename = "{}-{}-{}-{}-{}-{}sims.png".format(start_price, end_price, days, start_date, end_date, simulations)
    store_to_graph(all_sims, graph_filename, show=show)

if histogram:
    histogram_filename = "{}-{}-{}-{}-{}-{}sims-histogram.png".format(start_price, end_price, days, start_date, end_date, simulations)
    store_to_histogram(all_sims, histogram_filename, show=show)
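An example invocation for one of the scenarios above (using the flag names from the Usage string; the output file names are built from the parameters, e.g. 75000-100000-365-2019-01-01-2021-12-31-10000sims.png):

Code:
python price_probabilities.py 75000 100000 365 -n 10000 -s 2019-01-01 -e 2021-12-31 --store --graph --hist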
DirtyKeyboard
November 10, 2024, 03:15:44 AM Last edit: November 10, 2024, 08:56:11 AM by DirtyKeyboard
I'd like to add a new approach. You know how they say past performance doesn't dictate future results. What if they're wrong?

Here is a script that asks how many days in the past, from today, you want to match against the % changes of the BTC volume-weighted average price since 2012, and then uses that to predict the future.

- Step 1: go through the date range, looking for a sequence of past_percent_changes that best matches the recent_percent_changes.
- Step 2: put the lists side by side, take the delta between each day in the sequence, and use the sum of the deltas as a closeness ranking.
- Step 3: move one day forward and look at the new closeness ranking of that past_percent_changes against the recent_percent_changes.
- Step 4: using the best matching past date range, look at what happened next (in the past).
- Step 5: apply those next past percent changes to today's starting price sequentially to see what might happen in the future.

So, for example, the best match for the past week is from 2023-06-18 to 2023-06-25, so what happened from the 26th on for the next 7 days? The close score is the sum of the differences in the percent changes divided by the number of days chosen to match.

Close score: (0.79538) Range start date: 2023-06-18
Close score: (0.79579) Range start date: 2021-09-28
Close score: (0.88344) Range start date: 2023-01-18
Close score: (0.95793) Range start date: 2020-04-20
Close score: (0.96967) Range start date: 2020-04-03
Close score: (1.31444) Range start date: 2019-03-02
Close score: (1.38094) Range start date: 2012-04-29
Close score: (1.38704) Range start date: 2016-08-03
Close score: (1.38887) Range start date: 2023-09-28
Close score: (1.39045) Range start date: 2016-03-06

I think these graphs say, "Don't get scared next week, hold on for longer term profits!"

Code:
# from datetime import datetime, timedelta
import pyperclip
import matplotlib.dates as mdates
import pandas as pd
import matplotlib.pyplot as plt

start_date = '2012-01-01'  # input("Enter start date (YYYY-MM-DD): ")
end_date = '2024-11-09'    # input("Enter end date (YYYY-MM-DD): ")

# Read the CSV file into a DataFrame
df = pd.read_csv("C:/PyProjects/Predictor/result_with_timestamp.csv")

# Filter the DataFrame based on the user-defined date range
df = df[(df['Column 2 Name'] >= start_date) & (df['Column 2 Name'] <= end_date)]

# Calculate the percent change between days
df['Percent_Change'] = df['Column 1 Name'].pct_change() * 100

# Get user inputs
days = int(input("Number of past days to match: "))

# Get the most recent sequence of percent changes
recent_percent_changes = df['Percent_Change'].tail(days).dropna().tolist()

# Initialize a list to store closeness rankings
closeness_rankings = []

# Iterate through the DataFrame to find matching sequences
for i in range(len(df) - days + 1):
    # Get the current sequence of percent changes
    past_percent_changes = df['Percent_Change'].iloc[i:i + days].dropna().tolist()
    # Ensure we have a valid sequence
    if len(past_percent_changes) == days:
        # Calculate the deltas
        deltas = [abs(recent - past) for recent, past in zip(recent_percent_changes, past_percent_changes)]
        # Calculate the sum of deltas as the closeness ranking
        closeness_ranking = sum(deltas)
        # Store the result with the starting index
        closeness_rankings.append((i, closeness_ranking, past_percent_changes))

# Sort the closeness rankings by the ranking value
closeness_rankings.sort(key=lambda x: x[1])

n = 1
post = []
colors = ['black', 'red', 'orange', 'blue', 'green']
for x in colors:
    if len(closeness_rankings) > 1:
        second_best_index, second_best_ranking, past_changes = closeness_rankings[n]
        starting_date = df['Column 2 Name'].iloc[second_best_index]
    else:
        print("Not enough closeness rankings found.")
        exit()

    close_score = second_best_ranking / days

    print(f'Closeness ({close_score}): {starting_date}')
    series = f'Close score: ({close_score:.5f}) [color={x}]Range start date: {starting_date}[/color]'
    post.append(series)

    # Get the next percent changes starting from 'second_best_index + days'
    next_percent_changes = df['Percent_Change'].iloc[second_best_index + days:second_best_index + days + days].dropna().tolist()

    # Set the starting price as the last price in the DataFrame
    starting_price = float(df['Column 1 Name'].iloc[-1])

    # Initialize a list to store predicted prices
    predicted_prices = [starting_price]

    # Simulate the price changes using the next percent changes
    for percent_change in next_percent_changes:
        new_price = predicted_prices[-1] * (1 + percent_change / 100)
        predicted_prices.append(new_price)

    last_date = pd.to_datetime(df['Column 2 Name'].iloc[-1])
    predicted_dates = pd.date_range(start=last_date, periods=len(predicted_prices), freq='D')

    # Prepare data for saving to CSV
    predicted_df = pd.DataFrame({
        'Date': predicted_dates,
        'Predicted_Price': predicted_prices
    })

    # Save the predicted prices to a CSV file
    predicted_df.to_csv(f'C:/PyProjects/Predictor/predicted_simulation{n}.csv', index=False)
    n += 1

plt.figure(figsize=(12, 6))

# Function to plot data
def plot_data(plt, file, label, color):
    results_df = pd.read_csv(file)
    results_df.columns = results_df.columns.str.strip()  # Strip whitespace from column names
    results_df['Date'] = pd.to_datetime(results_df['Date'])  # Convert 'Date' to datetime
    plt.plot(results_df['Date'], results_df['Predicted_Price'], label=label, color=color)
    return results_df  # Return the DataFrame for later use

# Plot each predicted simulation and collect DataFrames
dfs = []
dfs.append(plot_data(plt, 'C:/PyProjects/Predictor/predicted_simulation1.csv', 'Best Past Match', 'black'))
dfs.append(plot_data(plt, 'C:/PyProjects/Predictor/predicted_simulation2.csv', '2nd Best', 'red'))
dfs.append(plot_data(plt, 'C:/PyProjects/Predictor/predicted_simulation3.csv', '3rd Best', 'orange'))
dfs.append(plot_data(plt, 'C:/PyProjects/Predictor/predicted_simulation4.csv', '4th Best', 'blue'))
dfs.append(plot_data(plt, 'C:/PyProjects/Predictor/predicted_simulation5.csv', '5th Best', 'green'))

# Set x-axis limits based on the min and max dates from all DataFrames
all_dates = pd.concat([df['Date'] for df in dfs])
plt.gca().set_xlim(left=all_dates.min(), right=all_dates.max())

# Add labels and title
plt.xlabel('Date')
plt.ylabel('Predicted Price')
plt.title(f'Predicted Prices for Best Past Matches over {days} days in the past')
plt.xticks(rotation=45)
plt.grid()
plt.tight_layout()

# Show the plot
plt.legend()
plt.show()

comment = f"{post}"
print(comment)
pyperclip.copy(comment)
Things got a bit messy trying to put everything on one graph.

Edit: I can't seem to figure out why this script fails when the days are more than somewhere around 250. The script doesn't fail exactly, it just doesn't output enough predicted_prices. It's so frustrating... er, fun and frustrating, that I can't find where things are going sideways. Any help? For example:

Close score: (1.86067) Range start date: 2023-03-17
Close score: (1.95362) Range start date: 2023-05-20
Close score: (1.95761) Range start date: 2023-03-25
Close score: (1.96330) Range start date: 2023-03-16
Close score: (1.98632) Range start date: 2023-03-18

Do you see where the red series ends early? To clarify, it ends early because the CSV file ends early, so why does that happen?
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
That's quite a different approach, but interesting nevertheless. It is not so much a probability-based prediction as a prediction by similarity. It's a bit like those Wall Observer users and "TA specialists" pasting an old chart into a new chart. It would be closer to a probability-based simulation if you took the 100 most similar price performances instead of only 5, and simulated based on them. I'll see if I can play around a bit with that. Another idea would be to try to find Elliott wave-style movements, basically looking for similarities in the more long-term movements of a moving average like an SMA or EMA.

Quote: "Edit: I can't seem to figure out why this script fails when the days are more than somewhere around 250. The script doesn't fail, it just doesn't output enough predicted_prices."

I've looked at it, and it's always the same series, only shifted by some days, and it's actually the last days of the price data (this is why the price row ends, as you probably already found out, if I interpret your last sentence correctly). So it seems that despite the days that are "missing", these are still the most similar price series. This means that other series are probably not similar at all. (No, this seems not to be the case, see below.) I have found that the last part of the algorithm (after the closeness rankings are calculated) seems not to work correctly, because if I list the closest price patterns (for 300 days) manually, I get dates from 2023, 2016 and 2022 (in this order by frequency), and these are not "cut off" by the limit date of the CSV file (in my case, October 2024).

For those wanting to copy the script, it may also be important to add that the original CSV pastebin has the column names "Price" (column 1) and "Date" (column 2), instead of "Column 1 Name" and "Column 2 Name", so you must replace these identifiers. The pyperclip module can also imo be left out, as it is not part of the standard Python library and not needed for the simulations: just comment out the last line (or the last three if you don't need the "comment" part).
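To illustrate the moving-average idea, here's a minimal sketch of how it could look (my assumption, using pandas' rolling mean on the same prices.csv with "Price"/"Date" columns; the closeness ranking from DirtyKeyboard's script could then run on the smoothed changes instead of the raw ones):

Code:
import pandas as pd

# Smooth the price series with a 30-day simple moving average.
df = pd.read_csv("prices.csv")
df["SMA30"] = df["Price"].rolling(window=30).mean()

# Percent changes of the smoothed series; matching on these would compare
# longer-term movements rather than day-to-day noise.
df["SMA_Change"] = df["SMA30"].pct_change() * 100
recent = df["SMA_Change"].tail(90).dropna().tolist()
print(recent[:5])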
DirtyKeyboard
November 11, 2024, 04:49:44 AM Last edit: November 11, 2024, 06:21:56 AM by DirtyKeyboard
Quote from d5000: "It would be closer to a probability-based simulation if you took the 100 most similar price performances instead of only 5, and simulated based on them."

>edit< Thanks. Thank you for looking at it.

Here's a lot cleaner version with the correct headers, using home directory filenames. It only attempts one graph. I still haven't found why it goes haywire with more than around 250 days.

Code:
import matplotlib.pyplot as plt
import pandas as pd

start_date = '2012-01-01'  # input("Enter start date (YYYY-MM-DD): ")
end_date = '2024-11-09'    # input("Enter end date (YYYY-MM-DD): ")

df = pd.read_csv("PriceDate.csv")
df['Percent_Change'] = df['Price'].pct_change() * 100
days = int(input("Number of past days to match: "))
recent_percent_changes = df['Percent_Change'].tail(days).dropna().tolist()

closeness_rankings = []
tries = len(df) - days
for i in range(0, tries):
    past_percent_changes = df['Percent_Change'].iloc[i:i + days].tolist()
    deltas = [abs(recent - past) for recent, past in zip(recent_percent_changes, past_percent_changes)]
    closeness_ranking = sum(deltas)
    closeness_rankings.append((i, closeness_ranking, past_percent_changes))

closeness_rankings.sort(key=lambda x: x[1])
second_best_index, second_best_ranking, past_changes = closeness_rankings[1]
starting_date = df['Date'].iloc[second_best_index]
print(starting_date)
df.to_csv('PercentChange.csv', index=False)
next_percent_changes = df['Percent_Change'].iloc[second_best_index + days:second_best_index + days + days].tolist()
starting_price = float(df['Price'].iloc[-1])
predicted_prices = [starting_price]

for percent_change in next_percent_changes:
    new_price = predicted_prices[-1] * (1 + percent_change / 100)
    predicted_prices.append(new_price)

last_date = pd.to_datetime(df['Date'].iloc[-1])
predicted_dates = pd.date_range(start=last_date, periods=len(predicted_prices))
predicted_df = pd.DataFrame({'Date': predicted_dates, 'Predicted_Price': predicted_prices})
predicted_df.to_csv('try_two_simulation.csv', index=False)

plt.figure(figsize=(12, 6))

def plot_data(plt, file, label, color):
    results_df = pd.read_csv(file)
    results_df.columns = results_df.columns.str.strip()
    results_df['Date'] = pd.to_datetime(results_df['Date'])
    plt.plot(results_df['Date'], results_df['Predicted_Price'], label=label, color=color)
    return results_df

dfs = []
dfs.append(plot_data(plt, 'try_two_simulation.csv', 'Best Past Match', 'black'))

all_dates = pd.concat([df['Date'] for df in dfs])
plt.gca().set_xlim(left=all_dates.min(), right=all_dates.max())
plt.xlabel('Date')
plt.ylabel('Predicted Price')
plt.title(f'Predicted Prices for Best Past Matches over {days} days in the past')
plt.xticks(rotation=45)
plt.grid()
plt.tight_layout()
plt.legend()
plt.show()
Here's that pastebin link again with the historical volume-weighted average prices. I've fixed the linked paste, but while looking for anything that could be going wrong, I found duplicate entries for 63554.119868161404,2024-09-23. So that should be fixed in any local copies. https://pastebin.com/3C67XqKW

I used a while loop to do multiple runs for every range of dates that has a close score average of less than 2. That means that for each pair of days in the present and past ranges, the percent change from day to day differs by an average of less than 2 percentage points. For example, if the 2nd day in the past range was a -1% change, and in the present range a +1% change, that is a close score of 2 for that day.

I figured out we could test this model, so I set the range to be from 2012-01-01 to 2024-10-11 and asked it for 30 days of matching (or close to it) % changes day to day, going back from October 11th, to predict prices up to today. I coulda called the spike to 80k!

Edit: So what about the next 30 days? $100k here we come!
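To make the close score concrete, a tiny worked example of the definition above (sum of per-day deltas divided by the number of matched days):

Code:
# Day-to-day percent changes of the present range and a candidate past range.
recent = [ 1.0, -2.0, 0.5]
past   = [-1.0, -2.5, 1.0]

# Per-day deltas: |1.0-(-1.0)| = 2.0, |-2.0-(-2.5)| = 0.5, |0.5-1.0| = 0.5
deltas = [abs(r - p) for r, p in zip(recent, past)]
close_score = sum(deltas) / len(recent)  # (2.0 + 0.5 + 0.5) / 3 = 1.0
print(close_score)  # 1.0 -> would pass a "less than 2" filter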
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
November 17, 2024, 11:11:18 PM Last edit: November 17, 2024, 11:34:13 PM by d5000 Merited by DirtyKeyboard (1)
I think I found the problem, but am still looking at how to solve it. The problem is this line:

Code:
next_percent_changes = df['Percent_Change'].iloc[second_best_index + days:second_best_index + days + days].dropna().tolist()

If I choose 300 days and the second_best_index is too close to the present date, then second_best_index + 300 + 300 will point beyond the end of the DataFrame, so the returned series will be shortened. For example, I get the following dates for the first three rankings:

- 2023-04-12
- 2023-04-25
- 2023-02-07

The limit of my original CSV file is 2024-10-26, and:

- 2023-04-12 + 600 days is 2024-12-02
- 2023-04-25 + 600 days is 2024-12-15
- 2023-02-07 + 600 days is 2024-09-29

So only for the third rank will the complete 300-day series be taken, and I can confirm it from the graph. The problem is thus that the script searches for this timeframe within the current DataFrame, instead of appending more days to the DataFrame.
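For reference, the underlying behavior is that .iloc slicing past the end of a Series truncates silently instead of raising an error:

Code:
import pandas as pd

s = pd.Series(range(10))
# Requesting 5 rows starting at position 8 silently returns only 2,
# which is exactly the "shortened series" effect described above.
print(len(s.iloc[8:8 + 5]))  # 2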
I think I found the fix. Simply delete the line I mentioned above (or comment it out) and replace all instances of the variable next_percent_changes with past_changes. I get the following image in this case (expanded to 7 matches, because the first six are quite close but the seventh is from 2016):
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
November 24, 2024, 02:46:26 AM Last edit: November 24, 2024, 04:48:55 AM by d5000 Merited by DirtyKeyboard (1)
Actually, I made a mistake in my last post. The script behaved correctly; however, if you project data 300 days into the future, for example, and the start date is less than 300 days before the last date in the CSV file, then of course it will project fewer than 300 days into the future.

I've made some more improvements to the "similarity" script, but one part of your last version is still missing. The improvements include: first, you can select how many days you want to project into the future; this number in the previous script was always equal to the length of the timeframe you compared to the past changes. Second, you can tell the script to ignore matches whose dates are too close together, which was a problem in the last version, where a lot of 2023 values were very close. A third improvement: the start date of each similar timeframe is shown in the chart.

- The script takes the arguments -d, --diverse and -n.
- -d is the number of days to compare (if you want the input method, just uncomment the commented line).
- --diverse ensures that the dates aren't too close together. The div_days variable (default: 60 days) defines the minimum distance between two of the predictions.
- -n is the length of the prediction; in the original script, the prediction was always as long as the compared timeframe.

The commented pyperclip part is for the "post" feature and can of course be uncommented if needed.
Code:
# import pyperclip
import datetime, sys
import matplotlib.dates as mdates
import pandas as pd
import matplotlib.pyplot as plt

# constants
CSVFILE = "prices23nov24.csv"
start_date = '2012-01-01'  # input("Enter start date (YYYY-MM-DD): ")
end_date = '2024-11-23'    # input("Enter end date (YYYY-MM-DD): ")
diversity = True  # diversity mode ensures the dates aren't too close to each other
div_days = 60
next_days = 60

def readarg(flag, default=None):
    if flag in sys.argv:
        if type(default) == bool:
            return True
        index = sys.argv.index(flag)
        if len(sys.argv) > index + 1:
            result = sys.argv[index + 1]
            if type(default) == int:
                return int(result)
            else:
                return result
    return default

def toocloseto(date_raw, ranks, days_raw):
    # for diversity mode: a candidate date can't be too close to an already selected one
    date = datetime.date.fromisoformat(date_raw)
    days = datetime.timedelta(days_raw)
    for rkdate_raw in ranks:
        rkdate = datetime.date.fromisoformat(rkdate_raw)
        diff = abs(date - rkdate)
        if diff < days:
            # print("Too close:", diff)
            return True
    return False

# Function to plot data
def plot_data(plt, file, label, color):
    results_df = pd.read_csv(file)
    results_df.columns = results_df.columns.str.strip()  # Strip whitespace from column names
    results_df['Date'] = pd.to_datetime(results_df['Date'])  # Convert 'Date' to datetime
    plt.plot(results_df['Date'], results_df['Predicted_Price'], label=label, color=color)
    return results_df  # Return the DataFrame for later use

# ARGUMENTS
# days: length of the timeframe to compare, in days
days = readarg("-d", 90)
# diverse: ensure that the dates aren't too close to one another
# (default minimum distance: 60 days, see div_days constant above)
diversity = readarg("--diverse", False)
# next_days: length of the timeframe to add
next_days = readarg("-n", 90)

# Read the CSV file into a DataFrame
df = pd.read_csv(CSVFILE)

# Filter the DataFrame based on the user-defined date range
df = df[(df['Date'] >= start_date) & (df['Date'] <= end_date)]

# Calculate the percent change between days
df['Percent_Change'] = df['Price'].pct_change() * 100

# Get user inputs
# days = int(input("Number of past days to match: "))

# Get the most recent sequence of percent changes
recent_percent_changes = df['Percent_Change'].tail(days).dropna().tolist()

# Initialize a list to store closeness rankings
closeness_rankings = []

# Iterate through the DataFrame to find matching sequences
for i in range(len(df) - days + 1):
    # Get the current sequence of percent changes
    past_percent_changes = df['Percent_Change'].iloc[i:i + days].dropna().tolist()
    past_dates = df['Date'].iloc[i:i + days].dropna().tolist()

    # Ensure we have a valid sequence
    if len(past_percent_changes) == days:
        # Calculate the deltas
        deltas = [abs(recent - past) for recent, past in zip(recent_percent_changes, past_percent_changes)]

        # Calculate the sum of deltas as the closeness ranking
        closeness_ranking = sum(deltas)
        # print(past_dates[0], past_dates[-1], closeness_ranking)

        # Store the result with the starting index
        closeness_rankings.append((i, closeness_ranking, past_percent_changes, past_dates[0], past_dates[-1]))

# Sort the closeness rankings by the ranking value
closeness_rankings.sort(key=lambda x: x[1])

n = 1  # n cycles through the ranks of the original dataframe
m = 1  # m cycles through the actual ranks
post = []
colors = ['black', 'red', 'orange', 'blue', 'green', 'yellow', 'grey']
bestranks = []
for x in colors:
    if len(closeness_rankings) > 1:
        while True:
            second_best_index, second_best_ranking, past_changes = closeness_rankings[n][:3]
            starting_date = df['Date'].iloc[second_best_index]
            if diversity and toocloseto(starting_date, bestranks, div_days):
                # print(starting_date, "too close to the date of one of the selected best ranks")
                n += 1
                continue
            else:
                # Get the next percent changes starting from 'second_best_index + days'
                next_percent_changes = df['Percent_Change'].iloc[second_best_index + days:second_best_index + days + next_days].dropna().tolist()
                if len(next_percent_changes) < next_days:
                    print(f"Series too short: {starting_date}, {len(next_percent_changes)} days.")
                    n += 1
                    continue
                bestranks.append(starting_date)
                break
    else:
        print("Not enough closeness rankings found.")
        exit()

    print("Rank", n)
    close_score = second_best_ranking / days
    print(f'Closeness ({close_score}): {starting_date}')
    # series = f'Close score: ({close_score:.5f}) [color={x}]Range start date: {starting_date}[/color]'
    # post.append(series)

    # Set the starting price as the last price in the DataFrame
    starting_price = float(df['Price'].iloc[-1])

    # Initialize a list to store predicted prices
    predicted_prices = [starting_price]

    # Simulate the price changes using the next percent changes
    for percent_change in next_percent_changes:
        new_price = predicted_prices[-1] * (1 + percent_change / 100)
        predicted_prices.append(new_price)

    last_date = pd.to_datetime(df['Date'].iloc[-1])
    predicted_dates = pd.date_range(start=last_date, periods=len(predicted_prices), freq='D')

    # Prepare data for saving to CSV
    predicted_df = pd.DataFrame({
        'Date': predicted_dates,
        'Predicted_Price': predicted_prices
    })

    # Save the predicted prices to a CSV file
    predicted_df.to_csv(f'predicted_simulation{m}.csv', index=False)
    n += 1
    m += 1

# Test:
for i in range(7):
    print("Rank", i, bestranks[i])

plt.figure(figsize=(12, 6))

# Plot each predicted simulation and collect DataFrames
dfs = []
dfs.append(plot_data(plt, 'predicted_simulation1.csv', 'Best Past Match: start {}'.format(bestranks[0]), 'black'))
dfs.append(plot_data(plt, 'predicted_simulation2.csv', '2nd Best: start {}'.format(bestranks[1]), 'red'))
dfs.append(plot_data(plt, 'predicted_simulation3.csv', '3rd Best: start {}'.format(bestranks[2]), 'orange'))
dfs.append(plot_data(plt, 'predicted_simulation4.csv', '4th Best: start {}'.format(bestranks[3]), 'blue'))
dfs.append(plot_data(plt, 'predicted_simulation5.csv', '5th Best: start {}'.format(bestranks[4]), 'green'))
dfs.append(plot_data(plt, 'predicted_simulation6.csv', '6th Best: start {}'.format(bestranks[5]), 'yellow'))
dfs.append(plot_data(plt, 'predicted_simulation7.csv', '7th Best: start {}'.format(bestranks[6]), 'grey'))

# Set x-axis limits based on the min and max dates from all DataFrames
all_dates = pd.concat([df['Date'] for df in dfs])
plt.gca().set_xlim(left=all_dates.min(), right=all_dates.max())

# Add labels and title
plt.xlabel('Date')
plt.ylabel('Predicted Price')
plt.title(f'Predicted Prices for Best Past Matches over {days} days in the past (prediction length: {next_days} days)')
plt.xticks(rotation=45)
plt.grid()
plt.tight_layout()

# Show the plot
plt.legend()
plt.show()

# comment = f"{post}"
# print(comment)
# pyperclip.copy(comment)
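For reference, a run matching the images below could look like this (assuming the script is saved as similarity.py, which is my placeholder name, not fixed anywhere above):

Code:
python similarity.py -d 300 -n 90 --diverse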
Some images. Best similarity matches for 300 days, projected 90 days into the future:

[chart image]

300 days, but with the --diverse argument (no dates closer than 60 days together):

[chart image]

Interesting that most are uber-bullish, but there's always a bearish option.
Edited: There were different images and a slightly different script in an earlier version of this post. The script contained a bug but now it should work fine.
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
Now that we've seen the first serious dip after last year's pump, I have again run some simulations. The scripts are unchanged (compared to last year). I want to thank @DirtyKeyboard, who seems to still update the CSV data regularly!

1. Probability by similarity: Here we see the most similar price movements recorded in the past, and how the price evolved after them, projected onto the current price. Interestingly, all similar dates are from 2016.

[chart image]

We see that in the past, when similar movements were recorded, the price always seems to have recovered, and despite the dip, all variants close above $100,000 within 90 days.

2. Original approach: Probability to close above $100,000 in the next 90 days (based on data from the last full cycle, i.e. 2018-22), taking $83,000 as the base (yesterday's price was near that average): 60.5%. The individual simulations can be seen in these graphs:

[graph and histogram images]

The most bearish variant still predicts more than $33,000, and the most bullish over $265,000, with most simulations closing near or above $100,000. [1] Doesn't look that bad.

[1] Interestingly, the maximum of the histogram is actually close to the current price (80-90k area). It seems, though, that the accumulated frequency of closing prices over $100,000 is much higher. While there are some slightly bearish to sideways scenarios, the probability of a real crash down to $50k or lower comes out as very low.
DirtyKeyboard
March 03, 2025, 06:30:39 AM
Quote from d5000: "While there are some slightly bearish to sideways scenarios, the probability of a real crash down to $50k or lower comes out as very low."

Looking good! So hard to account for all the variables. But I was wondering, on the #2 approach graphs, what would the histogram look like if the title was "Histogram of Sims by Max Price per Run" instead of the price on the last day? I ask because I was considering trying a script that wouldn't trade automatically, but might help one to set their buy/sell ladders.
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
March 03, 2025, 10:55:16 PM Last edit: March 04, 2025, 12:23:57 AM by d5000
But I was wondering, on the #2 approach graphs, what would the histogram look like if the title was "Histogram of Sims by Max Price per Run" instead of the price on the last day?
The original script actually does this; only the histogram variant used the last price. I have now changed the script to include a --max option for the histogram, and fed it with the following data:

- Start price: 85000
- End price (to reach): 100000
- Days: 30
- Time interval: 2018 to 2021 (last full 4-year cycle)

The probability for the target to be surpassed was 35.95% over 10000 simulations. The histograms obviously look very different: with the --max option, of course, no price goes below the start price (85000), as the first day is part of every series.

Using the last price:

[histogram image]

Using the max price:

[histogram image]

The title of the histogram graph had still not been changed, so it was misleading (in the second graph it should be "based on Max Price" instead of "by Price on Last Day"). (Edit: This was fixed.)

I'll make some more modifications to the script soon, as I have some new ideas; for example, the probability for the last price (rather than any price along the way) to reach the target would also be interesting.
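The run described above then corresponds to an invocation like this (the -s/-e values are also the script's defaults, shown here for clarity):

Code:
python price_probabilities.py 85000 100000 30 -s 2018-01-01 -e 2021-12-31 --hist --max --show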
Code: import sys, random, csv, datetime from decimal import Decimal import pandas as pd import matplotlib.pyplot as plt import matplotlib.dates as mdates """Usage:
python price_probabilities.py STARTPRICE TARGET_PRICE DAYS [-n SIMULATIONS] [-s STARTDATE_SERIES] [-e ENDDATE_SERIES] [--debug] [--store] [--ext] [--graph] [--hist] [--max]
If the STARTPRICE is below the TARGET_PRICE, then it calculates the probability to surpass that price. If the STARTPRICE is instead equal or above the TARGET_PRICE, then it calculates the probability to go below that price (useful to roughly estimate a crash risk)."""
# default values CSV = True CSVFILE = "prices.csv" STARTDATE = "2018-01-01" ENDDATE = "2021-12-31" SIMULATIONS = 10000 DEBUG = False
# series from sept 22 to oct 23, 2024
SEP22OCT23 = [0.92, -1.13, 0.08, -2.40, 0.96, -0.07, 1.49, -0.30, 0.81, 1.51, 5.11, -0.53, 1.13, 3.62, -0.52, -2.46, -0.14, -0.91, 1.22, -0.03, 2.19, 0.17, -0.31, -3.95, -3.46, -0.39, 0.14, 0.92, 3.20, -1.71, 1.46, -0.38]
def csv_to_series(csvfilename, startdatestr, enddatestr, debug=False): startdate = datetime.date.fromisoformat(startdatestr) enddate = datetime.date.fromisoformat(enddatestr) with open(csvfilename, "r") as csvfile: reader = csv.DictReader(csvfile) prices = [r for r in reader] change_series = [] prev_price = None for day_data in prices: date = datetime.date.fromisoformat(day_data["Date"]) if date < startdate: continue if date > enddate: break
price = Decimal(str(day_data["Price"])) if prev_price is None: if debug: print(price) prev_price = price continue change = Decimal(price / prev_price) change_series.append(change) prev_price = price if debug: print(price, change) if debug: print(change_series) return change_series
def store_to_csv(data, csvfilename, extended=False): with open(csvfilename, "w") as csvfile: writer = csv.writer(csvfile) if extended: writer.writerow(['Simulation', 'Day', 'Price']) for sim, row in enumerate(data): for day, result in enumerate(row): writer.writerow([sim + 1, day + 1, round(float(result), 2)]) else: writer.writerow(["Day {}".format(d + 1) for d in range(days)]) for row in data: writer.writerow([round(float(day), 2) for day in row])
def create_dataframe(data):
    startprice_sim = round(float(start_price), 2)
    data_list = []
    for sim, days in enumerate(data):
        data_list.append({"Simulation": sim, "Date": startdate_sim, "Price": startprice_sim})
        for day, price in enumerate(days):
            date = startdate_sim + datetime.timedelta(days=day + 1)
            data_list.append({"Simulation": sim, "Date": date, "Price": price})
    return pd.DataFrame(data_list)
def store_to_graph(data, graphfilename, show=False):
    # Create a figure and axis for the plot
    plt.figure(figsize=(12, 6))
    dataframe = create_dataframe(data)

    # Group by simulation number and plot each run
    for sim in dataframe["Simulation"].unique():
        sim_data = dataframe[dataframe['Simulation'] == sim]
        plt.plot(sim_data['Date'], sim_data['Price'], marker='o')

    max_date, min_date = dataframe["Date"].max(), dataframe["Date"].min()

    # Add labels and title
    plt.xlabel('Date')
    plt.ylabel('Price')
    plt.title('Price vs Date for Each Simulation')
    plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
    plt.xticks(rotation=45)  # Rotate x-axis labels for better readability
    plt.grid()  # Add grid for better readability
    plt.tight_layout()
    plt.gca().set_xlim(left=min_date, right=max_date)
    plt.subplots_adjust(left=0.1, right=0.9, top=0.9, bottom=0.15)

    # Mirror the y-axis on the right
    ax = plt.gca()
    ax2 = ax.twinx()
    ax2.set_ylabel('Price')
    ax2.yaxis.set_label_position('right')
    ax2.yaxis.tick_right()
    ax2.set_ylim(ax.get_ylim())

    # Export the plot
    if show:
        plt.show()
    else:
        plt.savefig(graphfilename)
def store_to_histogram(data, graphfilename, show=False, usemax=False):
    # Create a figure and axis for the plot
    plt.figure(figsize=(12, 6))
    dataframe = create_dataframe(data)
    # results_df = pd.read_csv('C:/PyProjects/Predictor/simulation_results.csv')

    # Extract the last price for each simulation.
    # modif: usemax uses the maximum price reached, not the last price.
    if usemax:
        prices = dataframe.groupby('Simulation')['Price'].max()
    else:
        prices = dataframe.groupby('Simulation')['Price'].last()

    # Create bins for the price ranges
    bins = range(int(prices.min()), int(prices.max()) + 1000, 1000)

    # Create a histogram of the last / max prices
    plt.figure(figsize=(12, 6))
    counts, _, patches = plt.hist(prices, bins=bins, edgecolor='black')

    # Calculate min, max, and mode
    min_price = prices.min()
    max_price = prices.max()
    mode_price = prices.mode()[0]  # Get the first mode if there are multiple

    # Label the columns with min, max, and mode prices
    plt.text(min_price, counts.max() * 0.9, f'Min: {min_price:.0f}', color='blue', fontsize=13)
    plt.text(max_price - len(bins) * 150, counts.max() * 0.9, f'Max: {max_price:.0f}', color='green', fontsize=13)
    # plt.text(mode_price, counts.max() * 0.8, f'Mode: {mode_price}', color='green', fontsize=10)

    # Set labels and title
    if usemax:
        x_label = 'Maximum Price of Each Simulation'
        hist_title = 'Histogram of Simulations by Max Price'
    else:
        x_label = 'Price on Last Day of Each Simulation'
        hist_title = 'Histogram of Simulations by Price on Last Day'
    plt.xlabel(x_label)
    plt.ylabel('Number of Simulations')
    plt.title(hist_title)

    # Show the plot
    if show:
        plt.show()
    else:
        plt.savefig(graphfilename)
def readarg(flag, default=None):
    if flag in sys.argv:
        index = sys.argv.index(flag)
        if type(default) == bool:
            return True
        # Bugfix: only read a value if one actually follows the flag
        # (the original bound allowed an IndexError for a trailing flag).
        if len(sys.argv) > index + 1:
            result = sys.argv[index + 1]
            if type(default) == int:
                return int(result)
            else:
                return result
    return default
# main script
start_price = Decimal(str(sys.argv[1]))
end_price = Decimal(str(sys.argv[2]))
days = int(sys.argv[3])
simulations = readarg("-n", 10000)
start_date = readarg("-s", STARTDATE)
end_date = readarg("-e", ENDDATE)
debug = readarg("--debug", DEBUG)
store = readarg("--store", False)
extended = readarg("--ext", False)
graph = readarg("--graph", False)
histogram = readarg("--hist", False)
usemax = readarg("--max", False)
show = readarg("--show", False)  # Shows the graph instead of storing it. Doing both at once doesn't seem to work :(
startdate_sim_str = readarg("-d", None)  # Sets an alternative start date (default: today), mostly for the graphs.
sd = datetime.date.fromisoformat(startdate_sim_str) if startdate_sim_str else datetime.date.today()
startdate_sim = datetime.datetime(sd.year, sd.month, sd.day)
base_price_series = SEP22OCT23
if CSV:
    price_series = csv_to_series(CSVFILE, start_date, end_date, debug)
else:
    price_series = [Decimal((100 + i) / 100) for i in base_price_series]
passed_sims = 0
all_sims = []
# Take into account that this loop does not add the start price to the sims.
for sim in range(simulations):
    price = start_price
    sim_days = []
    for day in range(days):
        price_change = random.choice(price_series)
        price = price * price_change
        sim_days.append(price)
    if store or graph or histogram:
        all_sims.append(sim_days)
    if debug:
        print("Simulation {}: {}".format(sim, sim_days))
    for dprice in sim_days:
        if ((end_price > start_price and dprice >= end_price)
                or (start_price >= end_price and dprice <= end_price)):
            passed_sims += 1
            break
probability = Decimal(passed_sims * 100 / simulations)
print("Price: {} in {} simulations".format(end_price, simulations)) print("Reached in {} simulations".format(passed_sims)) print("Probability: {} %".format(float(probability)))
if store:
    csv_filename = "{}-{}-{}-{}-{}-{}sims.csv".format(start_price, end_price, days, start_date, end_date, simulations)
    store_to_csv(all_sims, csv_filename, extended=extended)
if graph:
    graph_filename = "{}-{}-{}-{}-{}-{}sims.png".format(start_price, end_price, days, start_date, end_date, simulations)
    store_to_graph(all_sims, graph_filename, show=show)
if histogram:
    if usemax:
        histogram_filename = "{}-{}-{}-{}-{}-{}sims-maxprices-histogram.png".format(start_price, end_price, days, start_date, end_date, simulations)
    else:
        histogram_filename = "{}-{}-{}-{}-{}-{}sims-lastprices-histogram.png".format(start_price, end_price, days, start_date, end_date, simulations)
    store_to_histogram(all_sims, histogram_filename, show=show, usemax=usemax)
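For example, the max-price histogram run from the post above corresponds to a call like this (assuming the price series is saved as prices.csv; the exact invocation is only an illustration of the documented flags):

Code:
python price_probabilities.py 85000 100000 30 -n 10000 -s 2018-01-01 -e 2021-12-31 --hist --max

And since a start price above the target flips the logic, a rough crash-risk estimate works the same way, e.g. the probability of falling below 70,000 from 85,000 within 30 days:

Code:
python price_probabilities.py 85000 70000 30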
buwaytress
Legendary
Offline
Activity: 3402
Merit: 4093
I bit therefore I am
|
 |
March 05, 2025, 02:14:11 PM |
|
A little over my head, as things tend to do more often these days but nevertheless, cool beans. Methinks someone should run the numbers through a bot to try and see actual results. Maybe even on a weaker currency (ringgit, baht maybe?) to rack up them zeroes.
A few years ago, with more time on my hands, I would just feed the numbers into the Bitcoin Up or Down games casinos now have (oh, except the edge is so high, so boo). That would be the simplest test of this, no?
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
|
 |
May 21, 2025, 03:26:07 AM Last edit: May 21, 2025, 03:37:16 AM by d5000 |
|
I have fed the script with new data.

First question: will there be a new ATH in the next 14 days, starting from $106,000? The script returns a 66% probability that yes, i.e. that it will reach $110,000 (and we have already seen 107k).

Second question: will the price reach $130,000 in 90 days? This looks like a reasonable next step. The script says: 56.73% probability with 10,000 simulations.

Histogram for the last price:

This looks very slightly bearish at first glance, because the peak of the last-price histogram sits at about 100k. But even then it would still be possible for a new ATH to be reached before the last day.

Histogram for the max price:

As I think I already mentioned, the leftmost bar is the tallest because the max price can't be below the start price, so this bar collects all completely bearish simulations (those never rising above the starting price), which are only ~600 out of 10,000.

Graphs for 100 simulations:

I'm also currently experimenting with additional parameters to adjust volatility. As volatility has been declining over the last few years, it's possible that the simulation's base data exaggerates the realistic volatility, so imo a knob to adjust it makes sense.
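For those who want to experiment before I publish that version, here is a minimal sketch of how such a knob could work; the function name scale_volatility and the linear damping are just illustrative assumptions, not part of the script above:

Code:
from decimal import Decimal

# Hypothetical volatility knob (illustrative, not yet in the script):
# pull each daily change factor towards 1 by a constant. vol = 1 keeps
# the historical series unchanged; vol = 0.5 halves every daily move.
def scale_volatility(change_series, vol):
    factor = Decimal(str(vol))
    return [1 + (change - 1) * factor for change in change_series]

# Example: run the simulation with 70% of the historical volatility.
# price_series = scale_volatility(price_series, 0.7)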
Oh, I totally overlooked this answer: Methinks someone should run the numbers through a bot to try and see actual results. Maybe even on a weaker currency (ringgit, baht maybe?) to rack up them zeroes.
A few years ago, with more time on my hands, I would just feed the numbers into the Bitcoin Up or Down games casinos now have (oh, except the edge is so high, so boo). That would be the simplest test of this, no?
I guess that would be an interesting experiment. Simply betting on "up or down", however, would lose information, I think. You could of course allocate long and short positions and close them according to the simulations' outcomes. For example, if 60% of the outcomes are bullish, buy 60 units of BTC (long) and short-sell 40 units at the same price (short), setting the sell / re-buy price at the maximum or minimum price the simulation gave.

Example with 2 simulations (1 bullish, 1 bearish):

106000, 107000, 108000, 105000, 106000, 112000, 114000, 107000 => buy at 106000 and sell at 114000 if this price is reached
106000, 107000, 103000, 99000, 95000, 99000, 102000 => short-sell at 106000, re-buy at 95000
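If someone wanted to automate that, a rough sketch of such an allocation helper could look like this; allocate_positions is only an illustrative name, and I'm assuming here that "bullish" means the path ends above the start price:

Code:
# Illustrative sketch (not part of the script): split a position budget
# into longs and shorts based on simulated price paths.
def allocate_positions(sims, start_price, total_units=100):
    # A path counts as bullish if its last price is above the start price
    # (one possible definition; other cutoffs are conceivable).
    bullish = [p for p in sims if p[-1] > start_price]
    bearish = [p for p in sims if p[-1] <= start_price]
    n_long = round(total_units * len(bullish) / len(sims))
    n_short = total_units - n_long
    # Each long's sell target is the path's maximum,
    # each short's re-buy target is the path's minimum.
    return n_long, n_short, [max(p) for p in bullish], [min(p) for p in bearish]

# The two example paths from above:
sims = [
    [106000, 107000, 108000, 105000, 106000, 112000, 114000, 107000],
    [106000, 107000, 103000, 99000, 95000, 99000, 102000],
]
print(allocate_positions(sims, 106000, total_units=2))
# -> (1, 1, [114000], [95000])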