d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
November 02, 2024, 07:44:24 PM Last edit: November 08, 2024, 06:45:04 AM by d5000
Do you want to know how likely it is for Bitcoin to reach 100,000 USD before 2025, or 200,000 before 2026? I have written a Python script which lets you calculate the probability based on previous price movements. For example, you can take a timeframe from the past where you suspect that the market was in a situation similar to the current one. Many think, for example, that we're in a mid-bull market, so interesting years would be 2016/17 or 2020/21.

How does the script work? It takes the daily percentage price changes in the timeframe you select and assigns each of them the same probability. For example, if you use 2016/17, then it takes the daily changes from 2016-01-01 to 2016-01-02 up to 2017-12-30 to 2017-12-31, i.e. 730 days (2 x 365). Each of these changes has a probability of exactly 1/730. You now tell the script how many days you want to simulate. For example, until January 1, 2025 we have 60 days (as of November 2). And you also tell it the current Bitcoin price, e.g. $69,500. The script now applies one of the price changes from the data series at random, and repeats that as many times as the number of days you specified. If the target price is reached, the script increases a counter. At the end it tells you in how many simulations the target price was reached, compared to the total number of simulations.

The idea came from this thread; unfortunately it was closed while the discussion about the script was still ongoing, so I'm opening a thread exclusively for this script. (Interesting posts in the linked thread are this one, this one and above all this one by user @DirtyKeyboard, which includes an alternative script that works very similarly.) Interesting input also came from @Tubartuluk in the German subforum; the takeaway is that it's better to use a longer series of data.

This is of course not a 100% scientific approach. It's basically a simplified Monte Carlo simulation. It should be done just for fun, and not for investment decisions!
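To make the mechanism concrete, here is a minimal sketch of the core idea with made-up change factors (the real script derives them from the CSV price series, e.g. 730 equally weighted changes for 2016/17):

Code:
import random

# Toy example: daily change factors resampled with equal probability.
daily_changes = [1.02, 0.99, 1.05, 0.97, 1.01]  # made-up values

def reaches_target(start_price, target, days):
    # One Monte Carlo path: draw one historical daily change per simulated
    # day and report whether the target is hit at any point along the way.
    price = start_price
    for _ in range(days):
        price *= random.choice(daily_changes)
        if price >= target:
            return True
    return False

# Fraction of paths that hit the target approximates the probability.
hits = sum(reaches_target(69500, 100000, 60) for _ in range(10000))
print("Probability: {:.1f} %".format(hits / 10000 * 100))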
Here's the script. Save it as price-probabilities.py together with a CSV file with the price series. DirtyKeyboard has provided this CSV with Bitstamp data since 2012 in the other thread. If you don't know Python, simply save the data as "prices.csv" in the same folder as the script, and it will work as long as the first column in the CSV file is called "Price" and the second "Date", containing the price and date. Call the script with (Python 3.8+ required):

python price-probabilities.py STARTPRICE TARGET_PRICE DAYS [SIMULATIONS]

- STARTPRICE: the current price, or another price you want to start the simulation at.
- TARGET_PRICE: the price you want to know the probability of reaching (e.g. 100000).
- DAYS: number of days to simulate. For example, if you want to know the probability within the next year, use 365.
- SIMULATIONS: optional, the number of simulations (default: 10000; the higher the better).

In the script you have to specify the date range for the base series. Currently it includes the years 2020 and 2021, i.e. the latter two years of the previous bull market. You change this with the STARTDATE and ENDDATE variables in the script itself (I didn't want to add too many command line parameters).

Edit: An expanded script including graph and histogram is available in this post.

Code:
import sys, random, csv, datetime
from decimal import Decimal

# usage: python price_probabilities.py STARTPRICE ENDPRICE DAYS [SIMULATIONS]
# STARTPRICE is the price you use as a base, e.g. the current BTC/USD price
# ENDPRICE is the price you want to know the probability of being reached

CSV = True  # set to False to use the example series below
CSVFILE = "prices.csv"  # replace with the file name of your CSV data
STARTDATE = "2020-01-01"  # start date of the series
ENDDATE = "2021-12-31"  # end date of the series
DEBUG = False

# example series from Sept 22 to Oct 23, 2024 (daily percent changes)
SEP22OCT23 = [0.92, -1.13, 0.08, -2.40, 0.96, -0.07, 1.49, -0.30, 0.81, 1.51,
              5.11, -0.53, 1.13, 3.62, -0.52, -2.46, -0.14, -0.91, 1.22, -0.03,
              2.19, 0.17, -0.31, -3.95, -3.46, -0.39, 0.14, 0.92, 3.20, -1.71,
              1.46, -0.38]

def csv_to_series(csvfilename, startdatestr, enddatestr, debug=False):
    """Read daily prices from the CSV and return day-to-day change factors."""
    startdate = datetime.date.fromisoformat(startdatestr)
    enddate = datetime.date.fromisoformat(enddatestr)
    with open(csvfilename, "r") as csvfile:
        reader = csv.DictReader(csvfile)
        prices = [r for r in reader]
    change_series = []
    prev_price = None
    for day_data in prices:
        date = datetime.date.fromisoformat(day_data["Date"])
        if date < startdate:
            continue
        if date > enddate:
            break
        price = Decimal(str(day_data["Price"]))
        if prev_price is None:
            if debug:
                print(price)
            prev_price = price
            continue
        change = price / prev_price
        change_series.append(change)
        prev_price = price
        if debug:
            print(price, change)
    if debug:
        print(change_series)
    return change_series

# price probability
start_price = Decimal(str(sys.argv[1]))
end_price = Decimal(str(sys.argv[2]))
days = int(sys.argv[3])
if len(sys.argv) > 4:
    simulations = int(sys.argv[4])
else:
    simulations = 10000

base_price_series = SEP22OCT23

if CSV:
    price_series = csv_to_series(CSVFILE, STARTDATE, ENDDATE, DEBUG)
else:
    # convert percent changes to change factors, e.g. 0.92 -> 1.0092
    price_series = [Decimal(str(100 + i)) / 100 for i in base_price_series]

passed_sims = 0
for sim in range(simulations):
    price = start_price
    sim_days = []
    for day in range(days):
        price_change = random.choice(price_series)
        price = price * price_change
        sim_days.append(price)
    if DEBUG:
        print("Simulation {}: {}".format(sim, sim_days))
    for dprice in sim_days:
        if dprice >= end_price:
            passed_sims += 1
            break

probability = Decimal(passed_sims * 100) / simulations

print("Price: {} in {} simulations".format(end_price, simulations))
print("Reached in {} simulations".format(passed_sims))
print("Probability: {} %".format(float(probability)))
(Edited: removed unnecessary print statements from the script)
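An example run for the scenario mentioned above could look like this (the probability value is purely illustrative; results vary between runs):

Code:
$ python price-probabilities.py 69500 100000 60
Price: 100000 in 10000 simulations
Reached in 3120 simulations
Probability: 31.2 %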
DirtyKeyboard
Nice. I'm working on a little script to keep that pastebin file updated, and I added some graphing. I figured I would run till the end of this year, using the end of last year's data.

Code:
Enter start date (YYYY-MM-DD): 2023-11-03
Enter end date (YYYY-MM-DD): 2023-12-31
Number of days to simulate: 58
Starting price: 68442
Number of simulations: 100

Interesting spread. It ends in profit, but that bottom blue series doesn't look like fun.
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
November 07, 2024, 05:25:59 AM
I've tried your script with the graph plotting now, and it works awesome. I've adapted it a bit so it can be used with command line parameters, like my original script. Is it okay for you if I publish it here? (Otherwise I'll adapt my own script, adding the CSV saving and plotting parts.)

Now that we've already seen a new all-time high, I fed the script a $75,000 starting price. Some scenarios based on the whole last bull market (2019-01-01 to 2021-12-31) [1]:

- 35% for $100,000 until the end of the year
- 60% for $100,000 in 100 days
- 93% for $100,000 in 365 days
- 53% for $200,000 in 365 days

Unfortunately talkimg.com doesn't work for me at the moment, so I couldn't upload the pics.

I've also modified the script so it can also calculate the probability of crashes (i.e. the price falling below a certain threshold). This could be used to do some hedging. For an example run, I used the same data and calculated the probability of a crash to $50,000 until the end of the year. The probability was only 3.1%, which looks fine. Even $60,000 is not really likely, at 16%.

[1] I think the timeframe you chose in your last post is too short, as it is only a single small leg of a bull market, and an extremely bullish one at that. As we don't know in which stage of the bull market we are, I chose the entire last bull market, including the crashes near the start and near the end.
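For reference, the crash modification boils down to making the pass condition two-sided. Here is a small self-contained wrapper around the exact check that appears in the expanded script further down:

Code:
from decimal import Decimal

def crossed(start_price, end_price, sim_days):
    """True if the simulated path crosses end_price: upwards when the target
    is above the start price, downwards (crash case) when it is at or below."""
    for dprice in sim_days:
        if ((end_price > start_price and dprice >= end_price)
                or (start_price >= end_price and dprice <= end_price)):
            return True
    return False

# Crash example: did a path starting at 75000 dip to 50000 at any point?
print(crossed(Decimal(75000), Decimal(50000), [Decimal(72000), Decimal(49500)]))  # True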
DirtyKeyboard
November 07, 2024, 05:57:12 AM Last edit: November 07, 2024, 06:25:39 AM by DirtyKeyboard
I know, I know, you're all saying, "Why didn't you use the end of the year from 4 years ago?" Good question. I think. Anyway, this is all for fun speculation.

Code:
Enter start date (YYYY-MM-DD): 2020-11-07
Enter end date (YYYY-MM-DD): 2020-12-31
Number of days to simulate: 54
Starting price: 74057
Number of simulations: 10000
Number of simulations reaching $100,000 or more: 9592
Probability of reaching $100,000 or more: 95.92%

Wow! Who got in early? Like, as in yesterday early?

Here is the AI teamwork makes the LLM dream work script. I'm more of a terminal input person, but mix and match as you like. I added a histogram too. I think the auto format would work with less spread in the results. With the small time frame, I was just trying to have fun matching the days 1:1.

Code:
from datetime import datetime, timedelta
import csv, random
import matplotlib.dates as mdates
import pandas as pd
import matplotlib.pyplot as plt

# Get user inputs for date range
start_date = input("Enter start date (YYYY-MM-DD): ")
end_date = input("Enter end date (YYYY-MM-DD): ")

# Read the CSV file into a DataFrame
df = pd.read_csv("G:/Other computers/My Laptop/PyProjects/Top20/Top20_Total_03_24/VWAP_USD/result_with_timestamp.csv")

# Filter the DataFrame based on the user-defined date range
df = df[(df['Column 2 Name'] >= start_date) & (df['Column 2 Name'] <= end_date)]

# Calculate the percent change between days
df['Percent_Change'] = df['Column 1 Name'].pct_change() * 100

# Create a list of percent changes, dropping the first NaN value
percent_changes = df['Percent_Change'].dropna().tolist()

# Print the first few percent changes to verify
print(percent_changes[:5])

# Get user inputs
days = int(input("Number of days to simulate: "))
starting_price = float(input("Starting price: "))
number_of_simulations = int(input("Number of simulations: "))
wins = 0

# Initialize a list to store daily prices for averaging
daily_prices = [[] for _ in range(days)]

# Open the CSV file for writing results
with open('C:/PyProjects/Predictor/simulation_results.csv', mode='w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['Simulation', 'Date', 'Price'])  # Write header
    for sim in range(number_of_simulations):
        # Initialize the price list with the starting price for each simulation
        prices = [starting_price]
        reached_100k = False

        # Initialize the starting date
        current_date = datetime.now()  # or set to a specific start date
        writer.writerow([sim + 1, current_date.strftime('%Y-%m-%d'), starting_price])

        # Simulate price changes
        for day in range(days):
            percent_change = random.choice(percent_changes)
            new_price = prices[-1] * (1 + percent_change / 100)
            prices.append(new_price)

            current_date += timedelta(days=1)
            # Store the simulation number, date, and new price
            writer.writerow([sim + 1, current_date.strftime('%Y-%m-%d'), new_price])

            # Store the new price for averaging
            daily_prices[day].append(new_price)

            # Check if price has exceeded 100,000 at any point
            if new_price > 100_000 and not reached_100k:
                reached_100k = True
                wins += 1

        # Print the results for the simulation
        print(f"\nSimulation {sim + 1}:")
        print(f"Final price after {days} days: ${prices[-1]:.2f}")
        print(f"Total change: {((prices[-1] - starting_price) / starting_price * 100):.2f}%")
        print(f"Reached $100,000: {'Yes' if reached_100k else 'No'}")

print(f"\nNumber of simulations reaching $100,000 or more: {wins}")
print(f"Probability of reaching $100,000 or more: {(wins / number_of_simulations) * 100:.2f}%")

# Create a figure and axis for the plot
plt.figure(figsize=(12, 6))
results_df = pd.read_csv('C:/PyProjects/Predictor/simulation_results.csv')
results_df.columns = results_df.columns.str.strip()  # Remove any leading/trailing whitespace
results_df['Date'] = pd.to_datetime(results_df['Date'])

# Loop through each unique simulation number
for sim in results_df['Simulation'].unique():
    sim_data = results_df[results_df['Simulation'] == sim]
    plt.plot(sim_data['Date'], sim_data['Price'], marker='o')

# Add labels and title
plt.xlabel('Date')
plt.ylabel('Price')
plt.title('Price vs Date for Each Simulation')
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
plt.xticks(rotation=45)  # Rotate x-axis labels for better readability
plt.grid()  # Add grid for better readability
plt.tight_layout()
plt.gca().set_xlim(left=results_df['Date'].min(), right=results_df['Date'].max())
plt.subplots_adjust(left=0.1, right=0.9, top=0.9, bottom=0.15)

# Mirror the y-axis on the right
ax = plt.gca()
ax2 = ax.twinx()
ax2.set_ylabel('Price')
ax2.yaxis.set_label_position('right')
ax2.yaxis.tick_right()
ax2.set_ylim(ax.get_ylim())

# Show the plot
plt.show()

# Uncomment 2 lines below and replace with actual path
# file_name = f"{current_date:%m}-{current_date:%d}-{current_date:%H}-{current_date:%M}-{current_date:%S}.png"
# plt.savefig(f'C:/PyProjects/Predictor/{file_name}')

# Create a figure for the histogram
plt.figure(figsize=(12, 6))
results_df = pd.read_csv('C:/PyProjects/Predictor/simulation_results.csv')

# Extract the last price for each simulation
last_prices = results_df.groupby('Simulation')['Price'].last()

# Create bins for the price ranges
bins = range(int(last_prices.min()), int(last_prices.max()) + 1000, 1000)

# Create a histogram of the last prices
counts, _, patches = plt.hist(last_prices, bins=bins, edgecolor='black')

# Calculate min, max, and mode
min_price = last_prices.min()
max_price = last_prices.max()
mode_price = last_prices.mode()[0]  # Get the first mode if there are multiple

# Label the columns with min and max prices
plt.text(min_price, counts.max() * 0.9, f'Min: {min_price:.0f}', color='blue', fontsize=13)
plt.text(max_price - len(bins)*150, counts.max() * 0.9, f'Max: {max_price:.0f}', color='green', fontsize=13)
# plt.text(mode_price, counts.max() * 0.8, f'Mode: {mode_price}', color='green', fontsize=10)

# Set labels and title
plt.xlabel('Price on Last Day of Each Simulation')
plt.ylabel('Number of Simulations')
plt.title('Histogram of Simulations by Price on Last Day')

# Show the plot
plt.show()

# Uncomment 2 lines below and replace with actual path
# file_name = f"{current_date:%m}-{current_date:%d}-{current_date:%H}-{current_date:%M}-{current_date:%S}_2C.png"
# plt.savefig(f'C:/PyProjects/Predictor/{file_name}')

EDITED: Thanks, and if the pictures loaded, I heard imgBB is working. I might have also fixed the auto scaling.
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
Thanks for the histogram function, it looks awesome. I have now expanded the script with some command line options, the graph and histogram functions and the ability to save the results to CSV. The CSV format of my script is, however, simpler than the one in yours: it stores each simulation as a separate row, which looks more spreadsheet-friendly to me if someone wants to import the data into Excel, Calc etc.

Two known little problems, which I hope to solve soon:
- the histogram function creates two figures instead of one
- if the graph is shown, then it is not stored correctly (it seems the plot is blanked out), so I added a "--show" option; currently you can only either store the graph or show it.

See the Usage section in the script for the command line options.
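Before the full script, a quick illustration of the two --store layouts (the numbers are made up). The default format writes one simulation per row:

Code:
Day 1,Day 2,Day 3
75210.5,76100.2,74950.8
74800.1,75500.0,77020.3

while --ext uses the long format from DirtyKeyboard's script, one day per row:

Code:
Simulation,Day,Price
1,1,75210.5
1,2,76100.2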
Code:
import sys, random, csv, datetime
from decimal import Decimal
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

"""Usage:

python price_probabilities.py STARTPRICE TARGET_PRICE DAYS [-n SIMULATIONS] [-s STARTDATE_SERIES] [-e ENDDATE_SERIES] [--debug] [--store] [--ext] [--graph] [--hist] [--show]

If the STARTPRICE is below the TARGET_PRICE, then it calculates the probability
to surpass that price. If the STARTPRICE is instead equal to or above the
TARGET_PRICE, then it calculates the probability to go below that price
(useful to roughly estimate a crash risk)."""

# default values
CSV = True
CSVFILE = "prices.csv"
STARTDATE = "2019-01-01"
ENDDATE = "2021-12-31"
SIMULATIONS = 10000
DEBUG = False

# series from Sept 22 to Oct 23, 2024
SEP22OCT23 = [0.92, -1.13, 0.08, -2.40, 0.96, -0.07, 1.49, -0.30, 0.81, 1.51,
              5.11, -0.53, 1.13, 3.62, -0.52, -2.46, -0.14, -0.91, 1.22, -0.03,
              2.19, 0.17, -0.31, -3.95, -3.46, -0.39, 0.14, 0.92, 3.20, -1.71,
              1.46, -0.38]

def csv_to_series(csvfilename, startdatestr, enddatestr, debug=False):
    startdate = datetime.date.fromisoformat(startdatestr)
    enddate = datetime.date.fromisoformat(enddatestr)
    with open(csvfilename, "r") as csvfile:
        reader = csv.DictReader(csvfile)
        prices = [r for r in reader]
    change_series = []
    prev_price = None
    for day_data in prices:
        date = datetime.date.fromisoformat(day_data["Date"])
        if date < startdate:
            continue
        if date > enddate:
            break
        price = Decimal(str(day_data["Price"]))
        if prev_price is None:
            if debug:
                print(price)
            prev_price = price
            continue
        change = price / prev_price
        change_series.append(change)
        prev_price = price
        if debug:
            print(price, change)
    if debug:
        print(change_series)
    return change_series

def store_to_csv(data, csvfilename, extended=False):
    with open(csvfilename, "w", newline="") as csvfile:
        writer = csv.writer(csvfile)
        if extended:
            writer.writerow(['Simulation', 'Day', 'Price'])
            for sim, row in enumerate(data):
                for day, result in enumerate(row):
                    writer.writerow([sim + 1, day + 1, round(float(result), 2)])
        else:
            writer.writerow(["Day {}".format(d + 1) for d in range(days)])
            for row in data:
                writer.writerow([round(float(day), 2) for day in row])

def create_dataframe(data):
    startprice_sim = round(float(start_price), 2)
    data_list = []
    for sim, sim_prices in enumerate(data):
        data_list.append({"Simulation": sim, "Date": startdate_sim, "Price": startprice_sim})
        for day, price in enumerate(sim_prices):
            date = startdate_sim + datetime.timedelta(days=day + 1)
            data_list.append({"Simulation": sim, "Date": date, "Price": price})
    return pd.DataFrame(data_list)

def store_to_graph(data, graphfilename, show=False):
    # Create a figure and axis for the plot
    plt.figure(figsize=(12, 6))
    dataframe = create_dataframe(data)

    for sim in dataframe["Simulation"].unique():
        sim_data = dataframe[dataframe['Simulation'] == sim]
        plt.plot(sim_data['Date'], sim_data['Price'], marker='o')

    max_date, min_date = dataframe["Date"].max(), dataframe["Date"].min()

    # Add labels and title
    plt.xlabel('Date')
    plt.ylabel('Price')
    plt.title('Price vs Date for Each Simulation')
    plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
    plt.xticks(rotation=45)  # Rotate x-axis labels for better readability
    plt.grid()  # Add grid for better readability
    plt.tight_layout()
    plt.gca().set_xlim(left=min_date, right=max_date)
    plt.subplots_adjust(left=0.1, right=0.9, top=0.9, bottom=0.15)

    # Mirror the y-axis on the right
    ax = plt.gca()
    ax2 = ax.twinx()
    ax2.set_ylabel('Price')
    ax2.yaxis.set_label_position('right')
    ax2.yaxis.tick_right()
    ax2.set_ylim(ax.get_ylim())

    # Export the plot
    if show:
        plt.show()
    else:
        plt.savefig(graphfilename)

def store_to_histogram(data, graphfilename, show=False):
    # Create a figure and axis for the plot
    plt.figure(figsize=(12, 6))
    dataframe = create_dataframe(data)

    # Extract the last price for each simulation
    last_prices = dataframe.groupby('Simulation')['Price'].last()

    # Create bins for the price ranges
    bins = range(int(last_prices.min()), int(last_prices.max()) + 1000, 1000)

    # Create a histogram of the last prices
    plt.figure(figsize=(12, 6))  # (known issue: this opens a second, empty figure)
    counts, _, patches = plt.hist(last_prices, bins=bins, edgecolor='black')

    # Calculate min, max, and mode
    min_price = last_prices.min()
    max_price = last_prices.max()
    mode_price = last_prices.mode()[0]  # Get the first mode if there are multiple

    # Label the columns with min and max prices
    plt.text(min_price, counts.max() * 0.9, f'Min: {min_price:.0f}', color='blue', fontsize=13)
    plt.text(max_price - len(bins)*150, counts.max() * 0.9, f'Max: {max_price:.0f}', color='green', fontsize=13)
    # plt.text(mode_price, counts.max() * 0.8, f'Mode: {mode_price}', color='green', fontsize=10)

    # Set labels and title
    plt.xlabel('Price on Last Day of Each Simulation')
    plt.ylabel('Number of Simulations')
    plt.title('Histogram of Simulations by Price on Last Day')

    # Show the plot
    if show:
        plt.show()
    else:
        plt.savefig(graphfilename)

def readarg(flag, default=None):
    if flag in sys.argv:
        index = sys.argv.index(flag)
        if type(default) == bool:
            return True
        if len(sys.argv) > index + 1:
            result = sys.argv[index + 1]
            if type(default) == int:
                return int(result)
            else:
                return result
    return default

# main script
start_price = Decimal(str(sys.argv[1]))
end_price = Decimal(str(sys.argv[2]))
days = int(sys.argv[3])
simulations = readarg("-n", SIMULATIONS)
start_date = readarg("-s", STARTDATE)
end_date = readarg("-e", ENDDATE)
debug = readarg("--debug", DEBUG)
store = readarg("--store", False)
extended = readarg("--ext", False)
graph = readarg("--graph", False)
histogram = readarg("--hist", False)
show = readarg("--show", False)  # Shows the graph instead of storing it. Both don't seem to work :(
startdate_sim_str = readarg("-d", None)  # Alternative start date instead of today, mostly for the graphs.
sd = datetime.date.fromisoformat(startdate_sim_str) if startdate_sim_str else datetime.date.today()
startdate_sim = datetime.datetime(sd.year, sd.month, sd.day)

base_price_series = SEP22OCT23

if CSV:
    price_series = csv_to_series(CSVFILE, start_date, end_date, debug)
else:
    price_series = [Decimal(str(100 + i)) / 100 for i in base_price_series]

passed_sims = 0
all_sims = []

# Note that this loop does not add the start price to the sims.
for sim in range(simulations):
    price = start_price
    sim_days = []
    for day in range(days):
        price_change = random.choice(price_series)
        price = price * price_change
        sim_days.append(price)
    if store or graph or histogram:
        all_sims.append(sim_days)
    if debug:
        print("Simulation {}: {}".format(sim, sim_days))
    for dprice in sim_days:
        if ((end_price > start_price and dprice >= end_price)
                or (start_price >= end_price and dprice <= end_price)):
            passed_sims += 1
            break

probability = Decimal(passed_sims * 100) / simulations

print("Price: {} in {} simulations".format(end_price, simulations))
print("Reached in {} simulations".format(passed_sims))
print("Probability: {} %".format(float(probability)))

if store:
    csv_filename = "{}-{}-{}-{}-{}-{}sims.csv".format(start_price, end_price, days, start_date, end_date, simulations)
    store_to_csv(all_sims, csv_filename, extended=extended)

if graph:
    graph_filename = "{}-{}-{}-{}-{}-{}sims.png".format(start_price, end_price, days, start_date, end_date, simulations)
    store_to_graph(all_sims, graph_filename, show=show)

if histogram:
    histogram_filename = "{}-{}-{}-{}-{}-{}sims-histogram.png".format(start_price, end_price, days, start_date, end_date, simulations)
    store_to_histogram(all_sims, histogram_filename, show=show)
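An example invocation for one of the scenarios above (using the flag names from the Usage string; the output file names are built from the parameters, e.g. 75000-100000-365-2019-01-01-2021-12-31-10000sims.png):

Code:
python price_probabilities.py 75000 100000 365 -n 10000 -s 2019-01-01 -e 2021-12-31 --store --graph --hist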
DirtyKeyboard
November 10, 2024, 03:15:44 AM Last edit: November 10, 2024, 08:56:11 AM by DirtyKeyboard
I'd like to add a new approach. You know how they say past performance doesn't dictate future results. What if they're wrong?

Here is a script that asks how many days in the past, from today, you want to match against the % changes of the BTC volume-weighted average price since 2012, and then uses that to predict the future.

- Step 1: go through the date range, looking for a sequence of past_percent_changes that best matches the recent_percent_changes.
- Step 2: put the lists side by side, take the delta between each day in the sequence, and use the sum of the deltas as a closeness ranking.
- Step 3: move one day forward and look at the new closeness ranking of that past_percent_changes against the recent_percent_changes.
- Step 4: using the best matching past date range, look at what happened next (in the past).
- Step 5: apply those next past percent changes to today's starting price sequentially to see what might happen in the future.

So, for example, the best match for the past week is from 2023-06-18 to 2023-06-25, so what happened from the 26th on for the next 7 days? The close score is the sum of the differences in the percent changes divided by the number of days chosen to match.

Close score: (0.79538) Range start date: 2023-06-18
Close score: (0.79579) Range start date: 2021-09-28
Close score: (0.88344) Range start date: 2023-01-18
Close score: (0.95793) Range start date: 2020-04-20
Close score: (0.96967) Range start date: 2020-04-03
Close score: (1.31444) Range start date: 2019-03-02
Close score: (1.38094) Range start date: 2012-04-29
Close score: (1.38704) Range start date: 2016-08-03
Close score: (1.38887) Range start date: 2023-09-28
Close score: (1.39045) Range start date: 2016-03-06

I think these graphs say, "Don't get scared next week, hold on for longer term profits!"

Code:
# from datetime import datetime, timedelta
import pyperclip
import matplotlib.dates as mdates
import pandas as pd
import matplotlib.pyplot as plt

start_date = '2012-01-01'  # input("Enter start date (YYYY-MM-DD): ")
end_date = '2024-11-09'    # input("Enter end date (YYYY-MM-DD): ")

# Read the CSV file into a DataFrame
df = pd.read_csv("C:/PyProjects/Predictor/result_with_timestamp.csv")

# Filter the DataFrame based on the user-defined date range
df = df[(df['Column 2 Name'] >= start_date) & (df['Column 2 Name'] <= end_date)]

# Calculate the percent change between days
df['Percent_Change'] = df['Column 1 Name'].pct_change() * 100

# Get user inputs
days = int(input("Number of past days to match: "))

# Get the most recent sequence of percent changes
recent_percent_changes = df['Percent_Change'].tail(days).dropna().tolist()

# Initialize a list to store closeness rankings
closeness_rankings = []

# Iterate through the DataFrame to find matching sequences
for i in range(len(df) - days + 1):
    # Get the current sequence of percent changes
    past_percent_changes = df['Percent_Change'].iloc[i:i + days].dropna().tolist()
    # Ensure we have a valid sequence
    if len(past_percent_changes) == days:
        # Calculate the deltas
        deltas = [abs(recent - past) for recent, past in zip(recent_percent_changes, past_percent_changes)]
        # Calculate the sum of deltas as the closeness ranking
        closeness_ranking = sum(deltas)
        # Store the result with the starting index
        closeness_rankings.append((i, closeness_ranking, past_percent_changes))

# Sort the closeness rankings by the ranking value
closeness_rankings.sort(key=lambda x: x[1])

n = 1
post = []
colors = ['black', 'red', 'orange', 'blue', 'green']
for x in colors:
    if len(closeness_rankings) > 1:
        second_best_index, second_best_ranking, past_changes = closeness_rankings[n]
        starting_date = df['Column 2 Name'].iloc[second_best_index]
    else:
        print("Not enough closeness rankings found.")
        exit()

    close_score = second_best_ranking / days

    print(f'Closeness ({close_score}): {starting_date}')
    series = f'Close score: ({close_score:.5f}) [color={x}]Range start date: {starting_date}[/color]'
    post.append(series)

    # Get the next percent changes starting from 'second_best_index + days'
    next_percent_changes = df['Percent_Change'].iloc[second_best_index + days:second_best_index + days + days].dropna().tolist()

    # Set the starting price as the last price in the DataFrame
    starting_price = float(df['Column 1 Name'].iloc[-1])

    # Initialize a list to store predicted prices
    predicted_prices = [starting_price]

    # Simulate the price changes using the next percent changes
    for percent_change in next_percent_changes:
        new_price = predicted_prices[-1] * (1 + percent_change / 100)
        predicted_prices.append(new_price)

    last_date = pd.to_datetime(df['Column 2 Name'].iloc[-1])
    predicted_dates = pd.date_range(start=last_date, periods=len(predicted_prices), freq='D')

    # Prepare data for saving to CSV
    predicted_df = pd.DataFrame({
        'Date': predicted_dates,
        'Predicted_Price': predicted_prices
    })

    # Save the predicted prices to a CSV file
    predicted_df.to_csv(f'C:/PyProjects/Predictor/predicted_simulation{n}.csv', index=False)
    n += 1

plt.figure(figsize=(12, 6))

# Function to plot data
def plot_data(plt, file, label, color):
    results_df = pd.read_csv(file)
    results_df.columns = results_df.columns.str.strip()  # Strip whitespace from column names
    results_df['Date'] = pd.to_datetime(results_df['Date'])  # Convert 'Date' to datetime
    plt.plot(results_df['Date'], results_df['Predicted_Price'], label=label, color=color)
    return results_df  # Return the DataFrame for later use

# Plot each predicted simulation and collect DataFrames
dfs = []
dfs.append(plot_data(plt, 'C:/PyProjects/Predictor/predicted_simulation1.csv', 'Best Past Match', 'black'))
dfs.append(plot_data(plt, 'C:/PyProjects/Predictor/predicted_simulation2.csv', '2nd Best', 'red'))
dfs.append(plot_data(plt, 'C:/PyProjects/Predictor/predicted_simulation3.csv', '3rd Best', 'orange'))
dfs.append(plot_data(plt, 'C:/PyProjects/Predictor/predicted_simulation4.csv', '4th Best', 'blue'))
dfs.append(plot_data(plt, 'C:/PyProjects/Predictor/predicted_simulation5.csv', '5th Best', 'green'))

# Set x-axis limits based on the min and max dates from all DataFrames
all_dates = pd.concat([df['Date'] for df in dfs])
plt.gca().set_xlim(left=all_dates.min(), right=all_dates.max())

# Add labels and title
plt.xlabel('Date')
plt.ylabel('Predicted Price')
plt.title(f'Predicted Prices for Best Past Matches over {days} days in the past')
plt.xticks(rotation=45)
plt.grid()
plt.tight_layout()

# Show the plot
plt.legend()
plt.show()

comment = f"{post}"
print(comment)
pyperclip.copy(comment)
Things got a bit messy trying to put everything on one graph.

Edit: I can't seem to figure out why this script fails when the days are more than somewhere around 250. The script doesn't fail exactly, it just doesn't output enough predicted_prices. It's so frustrating... er, fun and frustrating, that I can't find where things are going sideways. Any help? For example:

Close score: (1.86067) Range start date: 2023-03-17
Close score: (1.95362) Range start date: 2023-05-20
Close score: (1.95761) Range start date: 2023-03-25
Close score: (1.96330) Range start date: 2023-03-16
Close score: (1.98632) Range start date: 2023-03-18

Do you see where the red series ends early? To clarify, it ends early because the CSV file ends early, so why does that happen?
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
That's quite a different approach, but interesting nevertheless. It is not so much a probability-based prediction as a prediction by similarity. It's a bit like those Wall Observer users and "TA specialists" pasting an old chart into a new chart. It would be closer to a probability-based simulation if you took the 100 most similar price performances instead of only 5, and simulated based on them. I'll see if I can play around a bit with that. Another idea would be to try to find Elliott wave-style movements, basically looking for similarities in the more long-term movements of a moving average like an SMA or EMA.

Quote: "Edit: I can't seem to figure out why this script fails when the days are more than somewhere around 250. The script doesn't fail, it just doesn't output enough predicted_prices."

I've looked at it, and it's always the same series, only shifted by some days, and it's actually the last days of the price data (this is why the price row ends, as you probably already found out, if I interpret your last sentence correctly). So it seems that despite the days that are "missing", these are still the most similar price series. This means that other series are probably not similar at all. (No, this seems not to be the case, see below.) I have found that the last part of the algorithm (after the closeness rankings are calculated) seems not to work correctly, because if I list the closest price patterns (for 300 days) manually, I get dates from 2023, 2016 and 2022 (in this order by frequency), and these are not "cut off" by the limit date of the CSV file (in my case, October 2024).

For those wanting to copy the script, it may also be important to add that the original CSV pastebin has the column names "Price" (column 1) and "Date" (column 2), instead of "Column 1 Name" and "Column 2 Name", so you must replace these identifiers. The pyperclip module can also imo be left out, as it is not part of the standard Python library and not needed for the simulations: just comment out the last line (or the last three if you don't need the "comment" part).
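To illustrate the moving-average idea, here's a minimal sketch of how it could look (my assumption, using pandas' rolling mean on the same prices.csv with "Price"/"Date" columns; the closeness ranking from DirtyKeyboard's script could then run on the smoothed changes instead of the raw ones):

Code:
import pandas as pd

# Smooth the price series with a 30-day simple moving average.
df = pd.read_csv("prices.csv")
df["SMA30"] = df["Price"].rolling(window=30).mean()

# Percent changes of the smoothed series; matching on these would compare
# longer-term movements rather than day-to-day noise.
df["SMA_Change"] = df["SMA30"].pct_change() * 100
recent = df["SMA_Change"].tail(90).dropna().tolist()
print(recent[:5])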
DirtyKeyboard
November 11, 2024, 04:49:44 AM Last edit: November 11, 2024, 06:21:56 AM by DirtyKeyboard
Quote from d5000: "It would be closer to a probability-based simulation if you took the 100 most similar price performances instead of only 5, and simulated based on them."

>edit< Thanks. Thank you for looking at it.

Here's a lot cleaner version with the correct headers, using home directory filenames. It only attempts one graph. I still haven't found why it goes haywire with more than around 250 days.

Code:
import matplotlib.pyplot as plt
import pandas as pd

start_date = '2012-01-01'  # input("Enter start date (YYYY-MM-DD): ")
end_date = '2024-11-09'    # input("Enter end date (YYYY-MM-DD): ")

df = pd.read_csv("PriceDate.csv")
df['Percent_Change'] = df['Price'].pct_change() * 100
days = int(input("Number of past days to match: "))
recent_percent_changes = df['Percent_Change'].tail(days).dropna().tolist()

closeness_rankings = []
tries = len(df) - days
for i in range(0, tries):
    past_percent_changes = df['Percent_Change'].iloc[i:i + days].tolist()
    deltas = [abs(recent - past) for recent, past in zip(recent_percent_changes, past_percent_changes)]
    closeness_ranking = sum(deltas)
    closeness_rankings.append((i, closeness_ranking, past_percent_changes))

closeness_rankings.sort(key=lambda x: x[1])
second_best_index, second_best_ranking, past_changes = closeness_rankings[1]
starting_date = df['Date'].iloc[second_best_index]
print(starting_date)
df.to_csv('PercentChange.csv', index=False)
next_percent_changes = df['Percent_Change'].iloc[second_best_index + days:second_best_index + days + days].tolist()
starting_price = float(df['Price'].iloc[-1])
predicted_prices = [starting_price]

for percent_change in next_percent_changes:
    new_price = predicted_prices[-1] * (1 + percent_change / 100)
    predicted_prices.append(new_price)

last_date = pd.to_datetime(df['Date'].iloc[-1])
predicted_dates = pd.date_range(start=last_date, periods=len(predicted_prices))
predicted_df = pd.DataFrame({'Date': predicted_dates, 'Predicted_Price': predicted_prices})
predicted_df.to_csv('try_two_simulation.csv', index=False)

plt.figure(figsize=(12, 6))

def plot_data(plt, file, label, color):
    results_df = pd.read_csv(file)
    results_df.columns = results_df.columns.str.strip()
    results_df['Date'] = pd.to_datetime(results_df['Date'])
    plt.plot(results_df['Date'], results_df['Predicted_Price'], label=label, color=color)
    return results_df

dfs = []
dfs.append(plot_data(plt, 'try_two_simulation.csv', 'Best Past Match', 'black'))

all_dates = pd.concat([df['Date'] for df in dfs])
plt.gca().set_xlim(left=all_dates.min(), right=all_dates.max())
plt.xlabel('Date')
plt.ylabel('Predicted Price')
plt.title(f'Predicted Prices for Best Past Matches over {days} days in the past')
plt.xticks(rotation=45)
plt.grid()
plt.tight_layout()
plt.legend()
plt.show()
Here's that pastebin link again with the historical volume-weighted average prices. I've fixed the linked paste, but while looking for anything that could be going wrong, I found duplicate entries for 63554.119868161404,2024-09-23. So that should be fixed in any local copies. https://pastebin.com/3C67XqKW

I used a while loop to do multiple runs for every range of dates that has a close score average of less than 2. That means that for each pair of days in the present and past ranges, the percent change from day to day differs by an average of less than 2 percentage points. For example, if the 2nd day in the past range was a -1% change, and in the present range a +1% change, that is a close score of 2 for that day.

I figured out we could test this model, so I set the range to be from 2012-01-01 to 2024-10-11 and asked it for 30 days of matching (or close to it) % changes day to day, going back from October 11th, to predict prices up to today. I coulda called the spike to 80k!

Edit: So what about the next 30 days? $100k here we come!
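To make the close score concrete, a tiny worked example of the definition above (sum of per-day deltas divided by the number of matched days):

Code:
# Day-to-day percent changes of the present range and a candidate past range.
recent = [ 1.0, -2.0, 0.5]
past   = [-1.0, -2.5, 1.0]

# Per-day deltas: |1.0-(-1.0)| = 2.0, |-2.0-(-2.5)| = 0.5, |0.5-1.0| = 0.5
deltas = [abs(r - p) for r, p in zip(recent, past)]
close_score = sum(deltas) / len(recent)  # (2.0 + 0.5 + 0.5) / 3 = 1.0
print(close_score)  # 1.0 -> would pass a "less than 2" filter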
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
November 17, 2024, 11:11:18 PM Last edit: November 17, 2024, 11:34:13 PM by d5000 Merited by DirtyKeyboard (1)
I think I found the problem, but am still looking at how to solve it. The problem is this line:

Code:
next_percent_changes = df['Percent_Change'].iloc[second_best_index + days:second_best_index + days + days].dropna().tolist()

If I choose 300 days and the second_best_index is too close to the present date, then second_best_index + 300 + 300 will point beyond the end of the DataFrame, so the returned series will be shortened. For example, I get the following dates for the first three rankings:

- 2023-04-12
- 2023-04-25
- 2023-02-07

The limit of my original CSV file is 2024-10-26, and:

- 2023-04-12 + 600 days is 2024-12-02
- 2023-04-25 + 600 days is 2024-12-15
- 2023-02-07 + 600 days is 2024-09-29

So only for the third rank will the complete 300-day series be taken, and I can confirm it from the graph. The problem is thus that the script searches for this timeframe within the current DataFrame, instead of appending more days to the DataFrame.
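For reference, the underlying behavior is that .iloc slicing past the end of a Series truncates silently instead of raising an error:

Code:
import pandas as pd

s = pd.Series(range(10))
# Requesting 5 rows starting at position 8 silently returns only 2,
# which is exactly the "shortened series" effect described above.
print(len(s.iloc[8:8 + 5]))  # 2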
I think I found the fix. Simply delete the line I mentioned above (or comment it out) and replace all instances of the variable next_percent_changes with past_changes. I get the following image in this case (expanded to 7 matches, because the first six are quite close but the seventh is from 2016):
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
November 24, 2024, 02:46:26 AM Last edit: November 24, 2024, 04:48:55 AM by d5000 Merited by DirtyKeyboard (1)
Actually, I made a mistake in my last post. The script behaved correctly; however, if you project data 300 days into the future, for example, and the start date is less than 300 days before the last date in the CSV file, then of course it will project fewer than 300 days into the future.

I've made some more improvements to the "similarity" script, but one part of your last version is still missing. The improvements include: first, you can select how many days you want to project into the future; this number in the previous script was always equal to the length of the timeframe you compared to the past changes. Second, you can tell the script to ignore matches whose dates are too close together, which was a problem in the last version, where a lot of 2023 values were very close. A third improvement: the start date of each similar timeframe is shown in the chart.

- The script takes the arguments -d, --diverse and -n.
- -d is the number of days to compare (if you want the input method, just uncomment the commented line).
- --diverse ensures that the dates aren't too close together. The div_days variable (default: 60 days) defines the minimum distance between two of the predictions.
- -n is the length of the prediction; in the original script, the prediction was always as long as the compared timeframe.

The commented pyperclip part is for the "post" feature and can of course be uncommented if needed.
Code:
# import pyperclip
import datetime, sys
import matplotlib.dates as mdates
import pandas as pd
import matplotlib.pyplot as plt

# constants
CSVFILE = "prices23nov24.csv"
start_date = '2012-01-01'  # input("Enter start date (YYYY-MM-DD): ")
end_date = '2024-11-23'    # input("Enter end date (YYYY-MM-DD): ")
diversity = True  # diversity mode ensures the dates aren't too close to each other
div_days = 60
next_days = 60

def readarg(flag, default=None):
    if flag in sys.argv:
        if type(default) == bool:
            return True
        index = sys.argv.index(flag)
        if len(sys.argv) > index + 1:
            result = sys.argv[index + 1]
            if type(default) == int:
                return int(result)
            else:
                return result
    return default

def toocloseto(date_raw, ranks, days_raw):
    # for diversity mode: a candidate date can't be too close to an already selected one
    date = datetime.date.fromisoformat(date_raw)
    days = datetime.timedelta(days_raw)
    for rkdate_raw in ranks:
        rkdate = datetime.date.fromisoformat(rkdate_raw)
        diff = abs(date - rkdate)
        if diff < days:
            # print("Too close:", diff)
            return True
    return False

# Function to plot data
def plot_data(plt, file, label, color):
    results_df = pd.read_csv(file)
    results_df.columns = results_df.columns.str.strip()  # Strip whitespace from column names
    results_df['Date'] = pd.to_datetime(results_df['Date'])  # Convert 'Date' to datetime
    plt.plot(results_df['Date'], results_df['Predicted_Price'], label=label, color=color)
    return results_df  # Return the DataFrame for later use

# ARGUMENTS
# days: length of the timeframe to compare, in days
days = readarg("-d", 90)
# diverse: ensure that the dates aren't too close to one another
# (default minimum distance: 60 days, see div_days constant above)
diversity = readarg("--diverse", False)
# next_days: length of the timeframe to add
next_days = readarg("-n", 90)

# Read the CSV file into a DataFrame
df = pd.read_csv(CSVFILE)

# Filter the DataFrame based on the user-defined date range
df = df[(df['Date'] >= start_date) & (df['Date'] <= end_date)]

# Calculate the percent change between days
df['Percent_Change'] = df['Price'].pct_change() * 100

# Get user inputs
# days = int(input("Number of past days to match: "))

# Get the most recent sequence of percent changes
recent_percent_changes = df['Percent_Change'].tail(days).dropna().tolist()

# Initialize a list to store closeness rankings
closeness_rankings = []

# Iterate through the DataFrame to find matching sequences
for i in range(len(df) - days + 1):
    # Get the current sequence of percent changes
    past_percent_changes = df['Percent_Change'].iloc[i:i + days].dropna().tolist()
    past_dates = df['Date'].iloc[i:i + days].dropna().tolist()

    # Ensure we have a valid sequence
    if len(past_percent_changes) == days:
        # Calculate the deltas
        deltas = [abs(recent - past) for recent, past in zip(recent_percent_changes, past_percent_changes)]

        # Calculate the sum of deltas as the closeness ranking
        closeness_ranking = sum(deltas)
        # print(past_dates[0], past_dates[-1], closeness_ranking)

        # Store the result with the starting index
        closeness_rankings.append((i, closeness_ranking, past_percent_changes, past_dates[0], past_dates[-1]))

# Sort the closeness rankings by the ranking value
closeness_rankings.sort(key=lambda x: x[1])

n = 1  # n cycles through the ranks of the original dataframe
m = 1  # m cycles through the actual ranks
post = []
colors = ['black', 'red', 'orange', 'blue', 'green', 'yellow', 'grey']
bestranks = []
for x in colors:
    if len(closeness_rankings) > 1:
        while True:
            second_best_index, second_best_ranking, past_changes = closeness_rankings[n][:3]
            starting_date = df['Date'].iloc[second_best_index]
            if diversity and toocloseto(starting_date, bestranks, div_days):
                # print(starting_date, "too close to the date of one of the selected best ranks")
                n += 1
                continue
            else:
                # Get the next percent changes starting from 'second_best_index + days'
                next_percent_changes = df['Percent_Change'].iloc[second_best_index + days:second_best_index + days + next_days].dropna().tolist()
                if len(next_percent_changes) < next_days:
                    print(f"Series too short: {starting_date}, {len(next_percent_changes)} days.")
                    n += 1
                    continue
                bestranks.append(starting_date)
                break
    else:
        print("Not enough closeness rankings found.")
        exit()

    print("Rank", n)
    close_score = second_best_ranking / days
    print(f'Closeness ({close_score}): {starting_date}')
    # series = f'Close score: ({close_score:.5f}) [color={x}]Range start date: {starting_date}[/color]'
    # post.append(series)

    # Set the starting price as the last price in the DataFrame
    starting_price = float(df['Price'].iloc[-1])

    # Initialize a list to store predicted prices
    predicted_prices = [starting_price]

    # Simulate the price changes using the next percent changes
    for percent_change in next_percent_changes:
        new_price = predicted_prices[-1] * (1 + percent_change / 100)
        predicted_prices.append(new_price)

    last_date = pd.to_datetime(df['Date'].iloc[-1])
    predicted_dates = pd.date_range(start=last_date, periods=len(predicted_prices), freq='D')

    # Prepare data for saving to CSV
    predicted_df = pd.DataFrame({
        'Date': predicted_dates,
        'Predicted_Price': predicted_prices
    })

    # Save the predicted prices to a CSV file
    predicted_df.to_csv(f'predicted_simulation{m}.csv', index=False)
    n += 1
    m += 1

# Test:
for i in range(7):
    print("Rank", i, bestranks[i])

plt.figure(figsize=(12, 6))

# Plot each predicted simulation and collect DataFrames
dfs = []
dfs.append(plot_data(plt, 'predicted_simulation1.csv', 'Best Past Match: start {}'.format(bestranks[0]), 'black'))
dfs.append(plot_data(plt, 'predicted_simulation2.csv', '2nd Best: start {}'.format(bestranks[1]), 'red'))
dfs.append(plot_data(plt, 'predicted_simulation3.csv', '3rd Best: start {}'.format(bestranks[2]), 'orange'))
dfs.append(plot_data(plt, 'predicted_simulation4.csv', '4th Best: start {}'.format(bestranks[3]), 'blue'))
dfs.append(plot_data(plt, 'predicted_simulation5.csv', '5th Best: start {}'.format(bestranks[4]), 'green'))
dfs.append(plot_data(plt, 'predicted_simulation6.csv', '6th Best: start {}'.format(bestranks[5]), 'yellow'))
dfs.append(plot_data(plt, 'predicted_simulation7.csv', '7th Best: start {}'.format(bestranks[6]), 'grey'))

# Set x-axis limits based on the min and max dates from all DataFrames
all_dates = pd.concat([df['Date'] for df in dfs])
plt.gca().set_xlim(left=all_dates.min(), right=all_dates.max())

# Add labels and title
plt.xlabel('Date')
plt.ylabel('Predicted Price')
plt.title(f'Predicted Prices for Best Past Matches over {days} days in the past (prediction length: {next_days} days)')
plt.xticks(rotation=45)
plt.grid()
plt.tight_layout()

# Show the plot
plt.legend()
plt.show()

# comment = f"{post}"
# print(comment)
# pyperclip.copy(comment)
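For reference, a run matching the images below could look like this (assuming the script is saved as similarity.py, which is my placeholder name, not fixed anywhere above):

Code:
python similarity.py -d 300 -n 90 --diverse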
Some images. Best similarity matches for 300 days, projected 90 days into the future:

[chart image]

300 days, but with the --diverse argument (no dates closer than 60 days together):

[chart image]

Interesting that most are uber-bullish, but there's always a bearish option.
Edited: There were different images and a slightly different script in an earlier version of this post. The script contained a bug but now it should work fine.
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
Now that we've seen the first serious dip after last year's pump, I have again run some simulations. The scripts are unchanged (compared to last year). I want to thank @DirtyKeyboard, who seems to still update the CSV data regularly!

1. Probability by similarity: Here we see the most similar price movements recorded in the past, and how the price evolved after them, projected onto the current price. Interestingly, all similar dates are from 2016.

[chart image]

We see that in the past, when similar movements were recorded, the price always seems to have recovered, and despite the dip, all variants close above $100,000 within 90 days.

2. Original approach: Probability to close above $100,000 in the next 90 days (based on data from the last full cycle, i.e. 2018-22), taking $83,000 as the base (yesterday's price was near that average): 60.5%. The individual simulations can be seen in these graphs:

[graph and histogram images]

The most bearish variant still predicts more than $33,000, and the most bullish over $265,000, with most simulations closing near or above $100,000. [1] Doesn't look that bad.

[1] Interestingly, the maximum of the histogram is actually close to the current price (80-90k area). It seems, though, that the accumulated frequency of closing prices over $100,000 is much higher. While there are some slightly bearish to sideways scenarios, the probability of a real crash down to $50k or lower comes out as very low.
DirtyKeyboard
March 03, 2025, 06:30:39 AM
Quote from d5000: "While there are some slightly bearish to sideways scenarios, the probability of a real crash down to $50k or lower comes out as very low."

Looking good! So hard to account for all the variables. But I was wondering, on the #2 approach graphs, what would the histogram look like if the title was "Histogram of Sims by Max Price per Run" instead of the price on the last day? I ask because I was considering trying a script that wouldn't trade automatically, but might help one to set their buy/sell ladders.
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
March 03, 2025, 10:55:16 PM Last edit: March 04, 2025, 12:23:57 AM by d5000
But I was wondering, on the #2 approach graphs, what would the histogram look like if the title was "Histogram of Sims by Max Price per Run" instead of the price on the last day?
The original script actually does this; only the histogram variant used the last price. I have now changed the script to include a --max option for the histogram, and fed it with the following data:

- Start price: 85000
- End price (to reach): 100000
- Days: 30
- Time interval: 2018 to 2021 (last full 4-year cycle)

The probability for the target to be surpassed was 35.95% over 10000 simulations. The histograms obviously look very different: with the --max option, of course, no price goes below the start price (85000), as the first day is part of every series.

Using the last price:

[histogram image]

Using the max price:

[histogram image]

The title of the histogram graph had still not been changed, so it was misleading (in the second graph it should be "based on Max Price" instead of "by Price on Last Day"). (Edit: This was fixed.)

I'll make some more modifications to the script soon, as I have some new ideas; for example, the probability for the last price (rather than any price along the way) to reach the target would also be interesting.
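The run described above then corresponds to an invocation like this (the -s/-e values are also the script's defaults, shown here for clarity):

Code:
python price_probabilities.py 85000 100000 30 -s 2018-01-01 -e 2021-12-31 --hist --max --show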
Code: import sys, random, csv, datetime from decimal import Decimal import pandas as pd import matplotlib.pyplot as plt import matplotlib.dates as mdates """Usage:
python price_probabilities.py STARTPRICE TARGET_PRICE DAYS [-n SIMULATIONS] [-s STARTDATE_SERIES] [-e ENDDATE_SERIES] [--debug] [--store] [--ext] [--graph] [--hist] [--max]
If the STARTPRICE is below the TARGET_PRICE, then it calculates the probability to surpass that price. If the STARTPRICE is instead equal or above the TARGET_PRICE, then it calculates the probability to go below that price (useful to roughly estimate a crash risk)."""
# default values CSV = True CSVFILE = "prices.csv" STARTDATE = "2018-01-01" ENDDATE = "2021-12-31" SIMULATIONS = 10000 DEBUG = False
# series from sept 22 to oct 23, 2024
SEP22OCT23 = [0.92, -1.13, 0.08, -2.40, 0.96, -0.07, 1.49, -0.30, 0.81, 1.51, 5.11, -0.53, 1.13, 3.62, -0.52, -2.46, -0.14, -0.91, 1.22, -0.03, 2.19, 0.17, -0.31, -3.95, -3.46, -0.39, 0.14, 0.92, 3.20, -1.71, 1.46, -0.38]
def csv_to_series(csvfilename, startdatestr, enddatestr, debug=False): startdate = datetime.date.fromisoformat(startdatestr) enddate = datetime.date.fromisoformat(enddatestr) with open(csvfilename, "r") as csvfile: reader = csv.DictReader(csvfile) prices = [r for r in reader] change_series = [] prev_price = None for day_data in prices: date = datetime.date.fromisoformat(day_data["Date"]) if date < startdate: continue if date > enddate: break
price = Decimal(str(day_data["Price"])) if prev_price is None: if debug: print(price) prev_price = price continue change = Decimal(price / prev_price) change_series.append(change) prev_price = price if debug: print(price, change) if debug: print(change_series) return change_series
def store_to_csv(data, csvfilename, extended=False): with open(csvfilename, "w") as csvfile: writer = csv.writer(csvfile) if extended: writer.writerow(['Simulation', 'Day', 'Price']) for sim, row in enumerate(data): for day, result in enumerate(row): writer.writerow([sim + 1, day + 1, round(float(result), 2)]) else: writer.writerow(["Day {}".format(d + 1) for d in range(days)]) for row in data: writer.writerow([round(float(day), 2) for day in row])
def create_dataframe(data):
    startprice_sim = round(float(start_price), 2)
    data_list = []
    for sim, days in enumerate(data):
        data_list.append({"Simulation": sim, "Date": startdate_sim, "Price": startprice_sim})
        for day, price in enumerate(days):
            date = startdate_sim + datetime.timedelta(days=day + 1)
            data_list.append({"Simulation": sim, "Date": date, "Price": price})
    return pd.DataFrame(data_list)
def store_to_graph(data, graphfilename, show=False):
    # Create a figure and axis for the plot
    plt.figure(figsize=(12, 6))
    dataframe = create_dataframe(data)

    # Group by simulation number and plot each run
    for sim in dataframe["Simulation"].unique():
        sim_data = dataframe[dataframe['Simulation'] == sim]
        plt.plot(sim_data['Date'], sim_data['Price'], marker='o')

    max_date, min_date = dataframe["Date"].max(), dataframe["Date"].min()

    # Add labels and title
    plt.xlabel('Date')
    plt.ylabel('Price')
    plt.title('Price vs Date for Each Simulation')
    plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
    plt.xticks(rotation=45)  # Rotate x-axis labels for better readability
    plt.grid()  # Add grid for better readability
    plt.tight_layout()
    plt.gca().set_xlim(left=min_date, right=max_date)
    plt.subplots_adjust(left=0.1, right=0.9, top=0.9, bottom=0.15)

    # Mirror the y-axis on the right
    ax = plt.gca()
    ax2 = ax.twinx()
    ax2.set_ylabel('Price')
    ax2.yaxis.set_label_position('right')
    ax2.yaxis.tick_right()
    ax2.set_ylim(ax.get_ylim())

    # Export the plot
    if show:
        plt.show()
    else:
        plt.savefig(graphfilename)
def store_to_histogram(data, graphfilename, show=False, usemax=False):
    # Create a figure and axis for the plot
    plt.figure(figsize=(12, 6))
    dataframe = create_dataframe(data)
    # results_df = pd.read_csv('C:/PyProjects/Predictor/simulation_results.csv')

    # Extract the last price for each simulation.
    # modif: usemax uses the maximum price reached, not the last price.
    if usemax:
        prices = dataframe.groupby('Simulation')['Price'].max()
    else:
        prices = dataframe.groupby('Simulation')['Price'].last()

    # Create bins for the price ranges
    bins = range(int(prices.min()), int(prices.max()) + 1000, 1000)

    # Create a histogram of the last / max prices
    plt.figure(figsize=(12, 6))
    counts, _, patches = plt.hist(prices, bins=bins, edgecolor='black')

    # Calculate min, max, and mode
    min_price = prices.min()
    max_price = prices.max()
    mode_price = prices.mode()[0]  # Get the first mode if there are multiple

    # Label the columns with min, max, and mode prices
    plt.text(min_price, counts.max() * 0.9, f'Min: {min_price:.0f}', color='blue', fontsize=13)
    plt.text(max_price - len(bins) * 150, counts.max() * 0.9, f'Max: {max_price:.0f}', color='green', fontsize=13)
    # plt.text(mode_price, counts.max() * 0.8, f'Mode: {mode_price}', color='green', fontsize=10)

    # Set labels and title
    if usemax:
        x_label = 'Maximum Price of Each Simulation'
        hist_title = 'Histogram of Simulations by Max Price'
    else:
        x_label = 'Price on Last Day of Each Simulation'
        hist_title = 'Histogram of Simulations by Price on Last Day'
    plt.xlabel(x_label)
    plt.ylabel('Number of Simulations')
    plt.title(hist_title)

    # Show the plot
    if show:
        plt.show()
    else:
        plt.savefig(graphfilename)
def readarg(flag, default=None):
    if flag in sys.argv:
        index = sys.argv.index(flag)
        if type(default) == bool:
            return True
        # Bugfix: only read a value if one actually follows the flag
        # (the original bound allowed an IndexError for a trailing flag).
        if len(sys.argv) > index + 1:
            result = sys.argv[index + 1]
            if type(default) == int:
                return int(result)
            else:
                return result
    return default
# main script
start_price = Decimal(str(sys.argv[1]))
end_price = Decimal(str(sys.argv[2]))
days = int(sys.argv[3])
simulations = readarg("-n", 10000)
start_date = readarg("-s", STARTDATE)
end_date = readarg("-e", ENDDATE)
debug = readarg("--debug", DEBUG)
store = readarg("--store", False)
extended = readarg("--ext", False)
graph = readarg("--graph", False)
histogram = readarg("--hist", False)
usemax = readarg("--max", False)
show = readarg("--show", False)  # Shows the graph instead of storing it. Doing both at once doesn't seem to work :(
startdate_sim_str = readarg("-d", None)  # Sets an alternative start date (default: today), mostly for the graphs.
sd = datetime.date.fromisoformat(startdate_sim_str) if startdate_sim_str else datetime.date.today()
startdate_sim = datetime.datetime(sd.year, sd.month, sd.day)
base_price_series = SEP22OCT23
if CSV:
    price_series = csv_to_series(CSVFILE, start_date, end_date, debug)
else:
    price_series = [Decimal((100 + i) / 100) for i in base_price_series]
passed_sims = 0
all_sims = []
# Take into account that this loop does not add the start price to the sims.
for sim in range(simulations):
    price = start_price
    sim_days = []
    for day in range(days):
        price_change = random.choice(price_series)
        price = price * price_change
        sim_days.append(price)
    if store or graph or histogram:
        all_sims.append(sim_days)
    if debug:
        print("Simulation {}: {}".format(sim, sim_days))
    for dprice in sim_days:
        if ((end_price > start_price and dprice >= end_price)
                or (start_price >= end_price and dprice <= end_price)):
            passed_sims += 1
            break
probability = Decimal(passed_sims * 100 / simulations)
print("Price: {} in {} simulations".format(end_price, simulations)) print("Reached in {} simulations".format(passed_sims)) print("Probability: {} %".format(float(probability)))
if store:
    csv_filename = "{}-{}-{}-{}-{}-{}sims.csv".format(start_price, end_price, days, start_date, end_date, simulations)
    store_to_csv(all_sims, csv_filename, extended=extended)
if graph:
    graph_filename = "{}-{}-{}-{}-{}-{}sims.png".format(start_price, end_price, days, start_date, end_date, simulations)
    store_to_graph(all_sims, graph_filename, show=show)
if histogram:
    if usemax:
        histogram_filename = "{}-{}-{}-{}-{}-{}sims-maxprices-histogram.png".format(start_price, end_price, days, start_date, end_date, simulations)
    else:
        histogram_filename = "{}-{}-{}-{}-{}-{}sims-lastprices-histogram.png".format(start_price, end_price, days, start_date, end_date, simulations)
    store_to_histogram(all_sims, histogram_filename, show=show, usemax=usemax)
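For example, the max-price histogram run from the post above corresponds to a call like this (assuming the price series is saved as prices.csv; the exact invocation is only an illustration of the documented flags):

Code:
python price_probabilities.py 85000 100000 30 -n 10000 -s 2018-01-01 -e 2021-12-31 --hist --max

And since a start price above the target flips the logic, a rough crash-risk estimate works the same way, e.g. the probability of falling below 70,000 from 85,000 within 30 days:

Code:
python price_probabilities.py 85000 70000 30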
buwaytress
Legendary
Offline
Activity: 3402
Merit: 4093
I bit therefore I am
|
 |
March 05, 2025, 02:14:11 PM |
|
A little over my head, as things tend to do more often these days but nevertheless, cool beans. Methinks someone should run the numbers through a bot to try and see actual results. Maybe even on a weaker currency (ringgit, baht maybe?) to rack up them zeroes.
A few years ago, with more time on my hands, I would just feed the numbers into the Bitcoin Up or Down games casinos now have (oh, except the edge is so high, so boo). That would be the simplest test of this, no?
d5000 (OP)
Legendary
Offline
Activity: 4508
Merit: 10106
Decentralization Maximalist
|
 |
May 21, 2025, 03:26:07 AM Last edit: May 21, 2025, 03:37:16 AM by d5000 |
|
I have fed the script with new data.

First question: will there be a new ATH in the next 14 days, starting from $106,000? The script returns a 66% probability that yes, i.e. that it will reach $110,000 (and we have already seen 107k).

Second question: will the price reach $130,000 in 90 days? This looks like a reasonable next step. The script says: 56.73% probability with 10,000 simulations.

Histogram for the last price:

This looks very slightly bearish at first glance, because the peak of the last-price histogram sits at about 100k. But even then it would still be possible for a new ATH to be reached before the last day.

Histogram for the max price:

As I think I already mentioned, the leftmost bar is the tallest because the max price can't be below the start price, so this bar collects all completely bearish simulations (those never rising above the starting price), which are only ~600 out of 10,000.

Graphs for 100 simulations:

I'm also currently experimenting with additional parameters to adjust volatility. As volatility has been declining over the last few years, it's possible that the simulation's base data exaggerates the realistic volatility, so imo a knob to adjust it makes sense.
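For those who want to experiment before I publish that version, here is a minimal sketch of how such a knob could work; the function name scale_volatility and the linear damping are just illustrative assumptions, not part of the script above:

Code:
from decimal import Decimal

# Hypothetical volatility knob (illustrative, not yet in the script):
# pull each daily change factor towards 1 by a constant. vol = 1 keeps
# the historical series unchanged; vol = 0.5 halves every daily move.
def scale_volatility(change_series, vol):
    factor = Decimal(str(vol))
    return [1 + (change - 1) * factor for change in change_series]

# Example: run the simulation with 70% of the historical volatility.
# price_series = scale_volatility(price_series, 0.7)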
Oh, I totally overlooked this answer: Methinks someone should run the numbers through a bot to try and see actual results. Maybe even on a weaker currency (ringgit, baht maybe?) to rack up them zeroes.
A few years ago, with more time on my hands, I would just feed the numbers into the Bitcoin Up or Down games casinos now have (oh, except the edge is so high, so boo). That would be the simplest test of this, no?
I guess that would be an interesting experiment. Simply betting on "up or down", however, would lose information, I think. You could of course allocate long and short positions and close them according to the simulations' outcomes. For example, if 60% of the outcomes are bullish, buy 60 units of BTC (long) and short-sell 40 units at the same price (short), setting the sell / re-buy price at the maximum or minimum price the simulation gave.

Example with 2 simulations (1 bullish, 1 bearish):

106000, 107000, 108000, 105000, 106000, 112000, 114000, 107000 => buy at 106000 and sell at 114000 if this price is reached
106000, 107000, 103000, 99000, 95000, 99000, 102000 => short-sell at 106000, re-buy at 95000
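If someone wanted to automate that, a rough sketch of such an allocation helper could look like this; allocate_positions is only an illustrative name, and I'm assuming here that "bullish" means the path ends above the start price:

Code:
# Illustrative sketch (not part of the script): split a position budget
# into longs and shorts based on simulated price paths.
def allocate_positions(sims, start_price, total_units=100):
    # A path counts as bullish if its last price is above the start price
    # (one possible definition; other cutoffs are conceivable).
    bullish = [p for p in sims if p[-1] > start_price]
    bearish = [p for p in sims if p[-1] <= start_price]
    n_long = round(total_units * len(bullish) / len(sims))
    n_short = total_units - n_long
    # Each long's sell target is the path's maximum,
    # each short's re-buy target is the path's minimum.
    return n_long, n_short, [max(p) for p in bullish], [min(p) for p in bearish]

# The two example paths from above:
sims = [
    [106000, 107000, 108000, 105000, 106000, 112000, 114000, 107000],
    [106000, 107000, 103000, 99000, 95000, 99000, 102000],
]
print(allocate_positions(sims, 106000, total_units=2))
# -> (1, 1, [114000], [95000])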