My sql query returning me 141 rows and two columns and i want to plot them in a horizontal bar graph. While doing so the data's are overlapping each other. How i can make the graph crisp and clear.
On Y axis it will show the report name and in X axis it'll show their value. How i can achieve that?
import os
import cx_Oracle
import matplotlib.pyplot as plt
import numpy as np
reportid_count = []
count_ID = []
c = conn.cursor()
query = 'select distinct (LTRIM(REGEXP_SUBSTR(ID, '[0-9]{3,}'), '0')) as ReportID,count(ID) from dev_user.RECORD_TABLE group by ID'
c.execute(query)
#loop through the rows fetched and store the records as arrays.
for row in c:
reportid_count.append(row[0])
print(row[0])
count_ID.append(row[1])
print(row[1])
fig = plt.figure(figsize=(13.5, 5))
#plot the bar chart
plt.barh(reportid_count,count_ID)#,color=['red', 'blue', 'purple']
for i, v in enumerate(count_ID):
plt.text(v, i, str(v), color='blue', fontweight='bold')
plt.title('Report_Details')
plt.xlabel('Report Count')
plt.ylabel("Report ID's")
path = r"\\dev_server.com\View\Foldert\uidDocuments\Store_Img"
os.chdir(path)
plt.savefig(path + '\squares.png')
plt.show()
conn.close()
Sample Dataset-
Report 1 1200
Report 2 0
Report 3 0
Report 4 0
Report 5 0
Report 6 0
Report 7 0
Report 8 0
Report 9 0
Report 10 0
Report 11 0
Report 12 0
Report 13 0
Report 14 0
Report 15 0
Report 16 0
Report 17 0
Report 18 0
Report 19 0
Report 20 0
Report 21 0
Report 22 0
Report 23 1
Report 24 2
Report 25 3
Report 26 4
Report 27 5
Report 28 6
Report 29 100
Report 30 101
Report 31 102
Report 32 103
Report 33 104
Report 34 105
Report 35 106
Report 36 107
Report 37 108
Report 38 109
Report 39 110
Report 40 111
Report 41 112
Report 42 113
Report 43 114
Report 44 115
Report 45 116
Report 46 117
Report 47 118
Report 48 119
Report 49 120
Report 50 121
Report 51 122
Report 52 123
Report 53 124
Report 54 125
Report 55 126
Report 56 127
Report 57 128
Report 58 129
Report 59 130
Report 60 131
Report 61 132
Report 62 133
Report 63 134
Report 64 135
Report 65 136
Report 66 137
Report 67 138
Report 68 139
Report 69 140
Report 70 141
Report 71 142
Report 72 143
Report 73 144
Report 74 145
Report 75 146
Report 76 147
Report 77 148
Report 78 149
Report 79 150
Report 80 151
Report 81 152
Report 82 153
Report 83 154
Report 84 155
Report 85 156
Report 86 157
Report 87 158
Report 88 159
Report 89 160
Report 90 161
Report 91 162
Report 92 163
Report 93 164
Report 94 165
Report 95 166
Report 96 167
Report 97 168
Report 98 169
Report 99 170
Report 100 171
Report 101 172
Report 102 173
Report 103 174
Report 104 175
Report 105 176
Report 106 177
Report 107 178
Report 108 179
Report 109 180
Report 110 181
Report 111 182
Report 112 183
Report 113 184
Report 114 185
Report 115 186
Report 116 187
Report 117 188
Report 118 189
Report 119 190
Report 120 191
Report 121 192
Report 122 193
Report 123 194
Report 124 195
Report 125 196
Report 126 197
Report 127 198
Report 128 199
Report 129 200
Report 130 201
Report 131 202
Report 132 203
Report 133 204
Report 134 205
Report 135 206
Report 136 207
Report 137 208
Report 138 209
Report 139 210
Report 140 211
Report 141 212
Report 142 0
Report 143 0
Report 144 0
Report 145 0
Report 146 0
Report 147 0
Report 148 0
Report 149 0
Report 150 0
Report 151 0
Report 152 700
Report 153 701
Report 154 702
Report 155 703
Report 156 704
Report 157 705
Report 158 706
Report 159 707
Report 160 708
Report 161 709
Report 162 710
Report 163 711
Report 164 712
Report 165 713
Report 166 714
Report 167 715
Report 168 716
Report 169 717
Report 170 718
Report 171 719
Report 172 720
Report 173 721
Report 174 722
Report 175 723
Report 176 724
Report 177 725
Report 178 726
Report 179 727
Report 180 728
Report 181 729
Report 182 730
Report 183 731
Report 184 732
Report 185 733
Report 186 734
Report 187 735
Report 188 736
Report 189 737
Report 190 738
Report 191 739
Report 192 740
Report 193 741
Report 194 742
Report 195 743
Report 196 744
Report 197 745
Report 198 746
Report 199 747
Report 200 748
Report 201 749
Report 202 750
Report 203 751
Report 204 752
Report 205 753
Report 206 754
Report 207 755
Report 208 756
Report 209 757
Report 210 758
Report 211 759
Report 212 760
Report 213 761
Report 214 762
Report 215 763
Report 216 764
Report 217 765
Report 218 766
Report 219 767
Report 220 768
Report 221 769
Report 222 770
Report 223 771
Report 224 772
Report 225 773
Report 226 774
Report 227 775
Report 228 776
Report 229 777
Report 230 778
Report 231 779
Report 232 780
Report 233 781
Report 234 782
Report 235 0
Report 236 0
Report 237 1300
Report 238 1400
Answer
To avoid overlapping in your horizontal bar chart and to make the graph more clear, we can use a combination of several techniques:
- Set the figure size: Increase the figure size to give enough space to all labels on the Y-axis.
- Rotate Y-axis labels: If there are many report names, rotating the labels can prevent them from overlapping.
- Sort the bars: Sorting the bars either by count or by report name will make the graph easier to read.
- Add padding between bars: Adjusting the
hspace
between the bars can create some space to prevent them from being too close. - Use log scale if counts vary widely: If the values on the X-axis (counts) vary dramatically, you can use a logarithmic scale to make the smaller bars more visible.
Here's an updated version of your code with these suggestions:
import os
import cx_Oracle
import matplotlib.pyplot as plt
import numpy as np
# Initialize lists for storing report names and counts
reportid_count = []
count_ID = []
# Establish a database connection (ensure `conn` is defined)
c = conn.cursor()
# Query to get report ID counts
query = 'select distinct (LTRIM(REGEXP_SUBSTR(ID, \'[0-9]{3,}\'), \'0\')) as ReportID, count(ID) from dev_user.RECORD_TABLE group by ID'
c.execute(query)
# Loop through the rows and store the data
for row in c:
reportid_count.append(row[0])
count_ID.append(row[1])
# Close the connection
conn.close()
# Sort by count in descending order to prevent overlapping
sorted_indices = np.argsort(count_ID)[::-1] # Sorting in descending order
reportid_count_sorted = np.array(reportid_count)[sorted_indices]
count_ID_sorted = np.array(count_ID)[sorted_indices]
# Create a figure and axes for the plot
fig, ax = plt.subplots(figsize=(13.5, 8))
# Plot the horizontal bar chart
ax.barh(reportid_count_sorted, count_ID_sorted, color='skyblue')
# Add text labels for each bar
for i, v in enumerate(count_ID_sorted):
ax.text(v + 10, i, str(v), color='blue', fontweight='bold')
# Set the title and labels
ax.set_title('Report Details', fontsize=16)
ax.set_xlabel('Report Count', fontsize=12)
ax.set_ylabel('Report ID\'s', fontsize=12)
# Rotate the Y-axis labels to prevent overlap
plt.yticks(rotation=0)
# Optionally, use log scale if the counts vary dramatically
# ax.set_xscale('log')
# Set padding between bars to make the chart less crowded
plt.subplots_adjust(left=0.2, right=0.8, top=0.9, bottom=0.1)
# Save the figure as an image file
path = r"\\dev_server.com\View\Foldert\uidDocuments\Store_Img"
os.chdir(path)
plt.savefig(path + '\\squares.png')
# Display the plot
plt.show()
Key Changes:
-
Sorting the bars:
- I sorted the bars in descending order by the count to make sure the most significant bars are shown at the top.
- This can help reduce overlap since bars with larger counts are visually more prominent.
-
Increasing figure size:
- The figure size has been adjusted to
(13.5, 8)
to provide more space for the chart.
- The figure size has been adjusted to
-
Spacing adjustments:
- Used
plt.subplots_adjust()
to adjust the margins, which can help make the chart less crowded.
- Used
-
Rotation of Y-axis labels:
- Set the Y-axis labels to have a horizontal orientation (
rotation=0
) for better clarity. You can adjust this depending on the length of your report names.
- Set the Y-axis labels to have a horizontal orientation (
-
Adding labels:
- Added labels on the bars for better clarity, with a little padding (
v + 10
) to ensure the labels are visible even if the bar is very short.
- Added labels on the bars for better clarity, with a little padding (
-
Log scale option:
- If your data varies widely (e.g., from 0 to a few thousand), consider using a logarithmic scale (
ax.set_xscale('log')
), which compresses large values and makes smaller ones more visible.
- If your data varies widely (e.g., from 0 to a few thousand), consider using a logarithmic scale (
Output:
This code will produce a horizontal bar chart where:
- The report names (IDs) are listed on the Y-axis.
- The corresponding count for each report is displayed on the X-axis.
- The bars are ordered by count in descending order, ensuring that the graph is not overcrowded.
- The bar labels are displayed next to the bars for clarity.
Let me know if you need further customizations or explanations!