Identifying Arbitrage Opportunities: Matching Oddschecker Data with Betfair Exchange
In the world of sports betting and political forecasting, arbitrage opportunities arise when there are discrepancies in odds between different bookmakers or betting exchanges. By comparing odds from Oddschecker (which aggregates traditional bookmaker odds) with those from the Betfair Exchange, we can potentially uncover these profitable discrepancies. In this article, we’ll explore how to use Python to match data from these two sources and identify potential arbitrage opportunities.
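As an illustrative example (with made-up numbers): suppose a bookmaker listed on Oddschecker offers back odds of 6.0 on a candidate while the best lay price on Betfair is 5.0. Backing £10 at 6.0 and laying £12 at 5.0 (liability £48) returns roughly £2 whichever way the market settles: £50 of back winnings minus the £48 lay liability if the outcome happens, or £12 of lay winnings minus the £10 back stake if it doesn't. Commission and stake limits, discussed later, eat into this edge.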
Setting Up the Environment
Before we dive into the arbitrage identification process, let’s set up our Python environment:
import betfairlightweight
from betfairlightweight import filters
from betfairlightweight import APIClient
import pandas as pd
from thefuzz import fuzz, process  # fuzzy string matching, used later when matching selection names (fuzzywuzzy/rapidfuzz expose the same interface)
from dotenv import load_dotenv
import os
# Load login credentials
load_dotenv()
bf_usr = os.getenv("BF_LOGIN")
bf_pass = os.getenv("BF_PASS")
bf_api = os.getenv("BF_API_KEY")
bf_certs_path = '../certs/'
# Login to Betfair client
client = APIClient(bf_usr, bf_pass, app_key=bf_api, certs=bf_certs_path)
client.login()
Fetching and Processing Betfair Data
To identify arbitrage opportunities, we first need to fetch the latest odds from Betfair:
def fetch_betfair_data(event_type_id='2378961'):  # Politics event type
    market_filter = betfairlightweight.filters.market_filter(
        event_type_ids=[event_type_id],
    )
    market_catalogues = client.betting.list_market_catalogue(
        filter=market_filter,
        max_results=100,
        market_projection=['RUNNER_DESCRIPTION']
    )
    market_ids = [market.market_id for market in market_catalogues]
    selection_mapping = {runner.selection_id: runner.runner_name
                         for market in market_catalogues
                         for runner in market.runners}
    market_name_mapping = {market.market_id: market.market_name
                           for market in market_catalogues}
    price_filter = betfairlightweight.filters.price_projection(
        price_data=['EX_BEST_OFFERS']
    )
    betfair_data = []
    for market_id in market_ids:
        market_books = client.betting.list_market_book(
            market_ids=[market_id],
            price_projection=price_filter
        )
        if market_books:
            market_book = market_books[0]
            market_name = market_name_mapping[market_id]
            for runner in market_book.runners:
                selection_name = selection_mapping.get(runner.selection_id, "Unknown")
                best_back = runner.ex.available_to_back[0].price if runner.ex.available_to_back else None
                best_lay = runner.ex.available_to_lay[0].price if runner.ex.available_to_lay else None
                # The size available at the best lay price is needed later for stake and liability calculations
                best_lay_size = runner.ex.available_to_lay[0].size if runner.ex.available_to_lay else None
                betfair_data.append({
                    'Market ID': market_id,
                    'Market Name': market_name,
                    'Selection Name': selection_name,
                    'Best Back Price': best_back,
                    'Best Lay Price': best_lay,
                    'Best Lay Size': best_lay_size,
                })
    return pd.DataFrame(betfair_data)

betfair_df = fetch_betfair_data()
Matching with Oddschecker Data
Next, we need to match this Betfair data with our previously scraped Oddschecker data. We could automate the market-level mapping with a text-similarity measure such as cosine similarity, but since the markets are fairly static it is simpler to maintain a manual mapping table linking each Betfair market ID to the corresponding Oddschecker URL.
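The exact shape of that mapping table isn't shown in the code below, so here is a minimal sketch of what mapping_df (and the relevant Oddschecker columns) might look like; the market IDs and URLs are placeholders, not real markets:

# Hypothetical mapping table: one row per market we want to compare.
mapping_df = pd.DataFrame([
    {'Market ID': '1.234567890', 'URL': 'https://www.oddschecker.com/politics/us-politics/us-presidential-election/winner'},
    {'Market ID': '1.234567891', 'URL': 'https://www.oddschecker.com/politics/british-politics/next-prime-minister'},
])

# oddschecker_df is assumed to have at least a 'URL' column (the page the odds were
# scraped from), a 'Bet' column (the selection name) and one column of decimal odds
# per bookmaker (e.g. 'WH', 'PP', 'BF', ...).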
def match_betfair_oddschecker(betfair_df, oddschecker_df, mapping_df):
    betfair_df['Selection Name Normalized'] = betfair_df['Selection Name'].str.lower().str.strip()
    oddschecker_df['Bet Normalized'] = oddschecker_df['Bet'].str.lower().str.strip()
    all_results = []
    for _, row in mapping_df.iterrows():
        market_id, url = row['Market ID'], row['URL']
        if pd.isna(url) or url.strip() == "":
            continue
        # .copy() avoids pandas' SettingWithCopyWarning when we add columns below
        betfair_filtered = betfair_df[betfair_df['Market ID'] == market_id].copy()
        oddschecker_filtered = oddschecker_df[oddschecker_df['URL'] == url]
        if betfair_filtered.empty or oddschecker_filtered.empty:
            continue
        bet_choices = oddschecker_filtered['Bet Normalized'].tolist()
        matches = betfair_filtered['Selection Name Normalized'].apply(
            lambda x: process.extractOne(x, bet_choices, scorer=fuzz.token_sort_ratio))
        betfair_filtered['Best Match Bet'] = matches.apply(lambda x: x[0])
        betfair_filtered['Similarity Score'] = matches.apply(lambda x: x[1])
        merged = pd.merge(betfair_filtered[betfair_filtered['Similarity Score'] >= 80],
                          oddschecker_filtered,
                          left_on=['Best Match Bet'],
                          right_on=['Bet Normalized'],
                          how='left')
        all_results.append(merged)
    return pd.concat(all_results, ignore_index=True) if all_results else pd.DataFrame()

matched_df = match_betfair_oddschecker(betfair_df, oddschecker_df, mapping_df)
Identifying Arbitrage Opportunities
Now that we have matched data from both sources, we can identify potential arbitrage opportunities:
def identify_arbitrage(df):
    # Bookmaker column codes from the Oddschecker scrape
    odds_columns = ['AKB', 'B3', 'BF', 'BY', 'CE', 'DP', 'EE', 'FB', 'FR', 'G5', 'KN', 'LD', 'LS', 'MA', 'N4', 'OE', 'PP', 'QN', 'S6', 'SI', 'SK', 'SX', 'UN', 'VC', 'VT', 'WA', 'WH']
    # Make sure the exchange columns are numeric before any calculations
    df['Best Lay Price'] = df['Best Lay Price'].astype(float)
    df['Best Lay Size'] = df['Best Lay Size'].astype(float)
    # Find the best bookmaker odds and the bookmaker(s) offering them
    df['Best Odds'] = df[odds_columns].max(axis=1).astype(float)
    df['Best Bookmaker'] = df.apply(lambda row: ', '.join([col for col in odds_columns if row[col] == row['Best Odds']]), axis=1)
    # Calculate the ratio of Best Odds to Best Lay Price (selections with Best Odds above 500 are zeroed out)
    df['Odds to Lay Ratio'] = df.apply(lambda row: row['Best Odds'] / row['Best Lay Price'] if row['Best Odds'] <= 500 else 0, axis=1)
    # Sort the DataFrame by the 'Odds to Lay Ratio' in descending order
    df = df.sort_values(by='Odds to Lay Ratio', ascending=False)
    # Calculate Lay Liability
    df['Lay Liability'] = (df['Best Lay Size'] * (df['Best Lay Price'] - 1)).round(2)
    # Calculate Back Amount for hedging
    df['Back Amount'] = (df['Best Lay Size'] * df['Best Lay Price'] / df['Best Odds']).round(2)
    # Calculate Profit in both scenarios
    df['Profit If Outcome Happens'] = (df['Back Amount'] * (df['Best Odds'] - 1)) - df['Lay Liability']
    df['Profit If Outcome Does Not Happen'] = df['Best Lay Size'] - df['Back Amount']
    # Set Profit to the minimum of the two scenarios, rounded to 2 decimal places
    df['Profit'] = df[['Profit If Outcome Happens', 'Profit If Outcome Does Not Happen']].min(axis=1).round(2)
    # Set Profit to 0 if Best Odds > 500
    df.loc[df['Best Odds'] > 500, 'Profit'] = 0
    # Filter for potential arbitrage opportunities
    arb_opportunities = df[df['Odds to Lay Ratio'] > 1]
    # Select and order the columns for the final DataFrame
    cols = ['Market Name', 'Selection Name', 'Odds to Lay Ratio', 'Best Odds', 'Best Lay Price', 'Best Lay Size', 'Best Bookmaker',
            'Lay Liability', 'Back Amount', 'Profit', 'URL']
    return arb_opportunities[cols]

arbitrage_opportunities = identify_arbitrage(matched_df)
Interpreting the Results
- Odds to Lay Ratio: The higher this ratio, the more profitable the potential arbitrage.
- Best Odds vs Best Lay Price: The difference between these indicates the size of the arbitrage opportunity.
- Lay Liability: This shows how much you need available in your Betfair account.
- Back Amount: This is how much you need to stake with the bookmaker.
- Profit: This is the guaranteed profit if you place both the back and lay bets at the indicated stakes.
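To make these columns concrete, here is a small worked example using the same formulas as identify_arbitrage; the odds and available size are invented for illustration:

# Illustrative numbers only: best bookmaker odds 6.0, best Betfair lay price 5.0,
# with £20 available to lay at that price.
best_odds = 6.0
best_lay_price = 5.0
best_lay_size = 20.0

lay_liability = best_lay_size * (best_lay_price - 1)        # 20 * 4 = £80 at risk on Betfair
back_amount = best_lay_size * best_lay_price / best_odds    # 100 / 6 ≈ £16.67 staked with the bookmaker

profit_if_happens = back_amount * (best_odds - 1) - lay_liability   # ≈ 83.33 - 80 = £3.33
profit_if_not = best_lay_size - back_amount                         # 20 - 16.67 ≈ £3.33
profit = round(min(profit_if_happens, profit_if_not), 2)            # guaranteed ≈ £3.33 before commission

Both scenarios give the same figure because the back stake is chosen to equalise them; the Odds to Lay Ratio here is 6.0 / 5.0 = 1.2, comfortably above 1.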
Challenges and Considerations
When identifying arbitrage opportunities, keep in mind:
- Timing: Odds can change rapidly, so you need to act fast.
- Account Limitations: Bookmakers may limit or close accounts that consistently exploit arbitrage opportunities.
- Liquidity: Ensure there’s enough liquidity on Betfair to place your lay bet at the indicated odds and stake.
- Terms and Conditions: Be aware of any terms that might void bets or change odds.
- Betfair Commission: The profit calculated doesn’t account for Betfair’s commission, which will slightly reduce your actual profit (see the sketch after this list).
- Stake Restrictions: Bookmakers might have maximum stake limits that could prevent you from placing the full required back bet.
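Commission only applies to net winnings on the exchange, so it is straightforward to fold into the calculation. A minimal sketch, assuming the standard 5% commission rate (your actual rate may differ), continuing the illustrative numbers from the worked example above:

COMMISSION = 0.05  # assumed 5% Betfair commission on net exchange winnings

best_odds, best_lay_price, best_lay_size = 6.0, 5.0, 20.0
back_amount = best_lay_size * best_lay_price / best_odds
lay_liability = best_lay_size * (best_lay_price - 1)

# If the outcome happens, the lay bet loses, so no commission is due on that side.
profit_if_happens = back_amount * (best_odds - 1) - lay_liability
# If the outcome does not happen, the lay winnings are reduced by commission.
profit_if_not = best_lay_size * (1 - COMMISSION) - back_amount

profit_after_commission = round(min(profit_if_happens, profit_if_not), 2)  # ≈ £2.33 instead of £3.33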
Conclusion
This enhanced approach to identifying arbitrage opportunities provides a more comprehensive view of potential profits and required stakes. It allows for a more informed decision-making process when considering arbitrage bets.
Further work could include automating the placement of bets on these arbitrage opportunities (a rough sketch of the lay side is shown below) and adding further price sources, for example www.predictit.org
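As a pointer for that further work, here is a minimal, untested sketch of how a lay order might be placed with betfairlightweight; the market ID, selection ID, price and size values are placeholders and would in practice come from arbitrage_opportunities:

def place_lay_order(market_id, selection_id, price, size):
    # Sketch only: real code needs error handling, minimum-stake checks and a
    # confirmation that the price is still available before placing the order.
    instruction = filters.place_instruction(
        order_type='LIMIT',
        selection_id=selection_id,
        side='LAY',
        limit_order=filters.limit_order(
            size=size,
            price=price,
            persistence_type='LAPSE'
        )
    )
    report = client.betting.place_orders(
        market_id=market_id,
        instructions=[instruction]
    )
    return report

# Hypothetical usage with placeholder values:
# place_lay_order('1.234567890', 12345678, price=5.0, size=2.0)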