Shakes Vision: Tutorial

Saturday, 6 June 2026

Rekhta Reader & Downloader (Web App) – Thousands of Books on Mobile and Desktop

I come bearing good news today. By now you have already guessed it from the title, and I strongly suspect some of you have been waiting impatiently for the day a proper Rekhta tool with an actual user interface would finally arrive.

This post is also available in Urdu here.

What can I say? The things we like for ourselves, we like for our friends as well. And I like sharing.

So, without further ceremony, here it is:

Rekhta Reader & Downloader

A complete tool that allows you to read and download books from Rekhta.org on both desktop and mobile devices.

Let us take a quick tour of the tool.

As soon as you open the application, you will be greeted with an interface similar to this:

Requirements

Before using this tool, you must install a CORS bypass extension in your browser.

Download the extension here. I have included the complete usage instructions again near the end of this article, so if anything feels unclear, you will find a detailed guide there as well.

You can also see the complete workflow in the diagram below. (Full-size image)

Extension page for reference:

When the extension is active, its logo appears in color. If it turns black, the extension is disabled.

Back to the tool itself.

Paste the URL of a book into the first field and click GO. The book pages will immediately open inside the reader for browsing and reading.

Search Books Directly — No Need to Visit Rekhta Separately

You can search Rekhta's book collection directly from within the application.

As you can see, the search results appear right inside the tool. Simply click the book you want to open and close the dialog window afterwards.

Click on any page image and start reading. This lets you browse a book before committing to downloading the whole thing — which may save some of us from downloading a dozen books simply because they looked interesting.

Clicking the PDF button downloads the entire book.

I will repeat the usage instructions again near the end of this article, though the screenshots above should already make things fairly self-explanatory.

Before that, however, it might be worth explaining how this tool came into existence in the first place.

The Story Behind This Tool (and a Few Dry Technical Details)

A while back, inspired by the work that Falsafi Bhai and Muhammad Umar Bhai had done on Urdu Mehfil, I put together a Node.js (JavaScript) version of the tool. It worked, and I used it regularly, but it came with three rather annoying problems:

It still had to be run from the command line. There was a graphical interface at one point, but it never really worked properly.
It only worked on a computer. So whenever a book was needed, the routine was always the same: open the laptop first, then download the book.
There was no reading facility. One had to download books blindly. And before downloading a dozen books out of sheer curiosity, it is usually nice to know whether a book is actually worth reading in the first place.

The original plan was to package the entire JavaScript library as an NPM package. Partly because it would be useful, and partly because it seemed like a good opportunity to gain some experience in that area as well.

Then, as often happens with side projects, life intervened and the idea quietly slipped out of mind.

I remember mentioning it to Umar Bhai during a private conversation on Mehfil. A few months later the idea resurfaced, so naturally I opened Mehfil again and started digging through the old discussion threads to refresh my memory and reconstruct the project's history.

Eventually I landed on Muhammad Umar Bhai's GitHub profile and discovered that he had already published a JavaScript version of the tool.

Suffice it to say, I sat there feeling slightly defeated.

(For the record, the Windows version of this tool still exists and continues to work perfectly well. The only catch is that it has no graphical user interface. Think of it as a traditional command-line application that you run through Command Prompt. Not particularly difficult, but an interface certainly makes life easier.)

Then fate decided to intervene.

Looking back at that old codebase sparked a new idea: why not port the entire thing to the frontend and make it work directly as a web application? If successful, everything could happen inside the browser without requiring users to install or run anything complicated.

There was, however, one obstacle.

Browser security policies generally do not allow one website to freely interact with another website's resources. In technical terms, the browser gets quite protective and starts throwing what developers lovingly call CORS errors.

To work around that limitation, I considered two possible solutions and built support for both into the web application.

Proxy Backend

The first approach was to use a proxy backend. In simple terms, a backend service fetches the data on your behalf and returns the response to the application. Since the browser sees the request as originating from your own backend, it usually has no reason to object. In theory, this should solve the problem quite neatly.

For the sake of accuracy, I should mention that I never fully tested this route in production. My expectation is that it should work, but because the real challenge involves loading images rather than ordinary API responses, there is always the possibility of running into the same CORS restrictions further down the chain. Which brings us to the second and far simpler workaround...

CORS Bypass Extension

This is by far the easiest solution. Browser extensions exist specifically for bypassing CORS restrictions, and several of them are freely available. Install one, enable it when needed, and the application can access the resources it requires. The extension linked in this article is the one I have been using myself.

As inelegant as browser workarounds sometimes feel, they have one undeniable advantage: they save everyone from setting up servers, configuring proxies, maintaining infrastructure, and generally turning a simple reading tool into a full-time engineering project.
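For anyone curious what the proxy route would involve, here is a minimal sketch rather than the code used in the tool. It assumes Node.js 18 or newer (for the built-in fetch). The /proxy?url=... path, the port, and the example.org allow-list entry are placeholders invented for illustration, and none of it has been tested against Rekhta. The idea is simply that the server fetches the resource and hands it back with an Access-Control-Allow-Origin header, so the browser accepts it.

```javascript
// Hypothetical allow-list: only forward requests to hosts we expect.
const ALLOWED_HOSTS = new Set(["example.org"]);

// Extract and validate the target from a request path like "/proxy?url=...".
function parseTarget(requestUrl) {
  const { searchParams } = new URL(requestUrl, "http://localhost");
  const raw = searchParams.get("url");
  if (!raw) return null;
  let target;
  try {
    target = new URL(raw);
  } catch {
    return null; // not a valid absolute URL
  }
  if (target.protocol !== "https:" || !ALLOWED_HOSTS.has(target.hostname)) {
    return null;
  }
  return target;
}

// Fetch the target server-side and relay it with a permissive CORS header.
async function handle(req, res) {
  const target = parseTarget(req.url);
  if (!target) {
    res.writeHead(400, { "Content-Type": "text/plain" });
    res.end("Missing or disallowed url parameter");
    return;
  }
  try {
    const upstream = await fetch(target);
    const body = Buffer.from(await upstream.arrayBuffer());
    res.writeHead(upstream.status, {
      "Content-Type":
        upstream.headers.get("content-type") ?? "application/octet-stream",
      "Access-Control-Allow-Origin": "*",
    });
    res.end(body);
  } catch (err) {
    res.writeHead(502, { "Content-Type": "text/plain" });
    res.end("Upstream request failed");
  }
}

// Dynamic import works in both CommonJS and ES module files.
async function startProxy(port) {
  const http = await import("node:http");
  return http.createServer(handle).listen(port);
}
```

Calling startProxy(8080) would start the server. The web application would then request http://localhost:8080/proxy?url=... instead of the original address. The allow-list is there so the proxy cannot be used to fetch arbitrary sites.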

In the end, the goal was not to build a monument to software architecture. The goal was much simpler:

Open a book. Read a book. Download a book if you want.

Preferably from a phone while lying comfortably on a sofa.

How to Use the Tool

To use the tool, simply follow the steps below in order.

1. Install a CORS bypass extension. Download it here.

If you intend to use the tool on Android, make sure you are using a browser that supports extensions. One such browser is Kiwi Browser (Kiwi Browser for Android). You can download the APK from the Assets section of the release page, or obtain it from any other trusted source you prefer.
2. Open the Rekhta Reader web application: Rekhta Reader (https://tools.shakeeb.in/rekhta-reader/)
3. Enable the extension and start using the web application.

If you run into any issues, feel free to leave a comment below. Suggestions and feedback are equally welcome.

These days I find myself reading most books directly inside the tool, and rarely need to download them anymore. On mobile, however, I still download books occasionally simply because the reading experience is sometimes more comfortable that way.

What Comes Next?

This is not my first Rekhta-related project.

Some time ago I released another tool for Rekhta content in the form of a script and an Android application.

Setting Up OCR for Windows and Linux: A Comprehensive Guide

Automating repetitive tasks like extracting text from images can save valuable time (unless you don't value it, in that case it will save some worthless time). This process, known as Optical Character Recognition (OCR), is a powerful tool for converting text in images into editable formats. Here, I’ll walk you through setting up custom OCR solutions for both Windows (that we all have) and Linux (mostly used in offices) systems, complete with keyboard shortcuts for seamless integration.

I primarily use OCR for Urdu in my personal work, but professionally it is also required for English. Using Google Lens is a fine option, except if you dislike repeating those clicks and key presses just to copy text from an image. And to be honest - I kind of feel bad even for giant corps like Google when I unnecessarily utilize their 'precious' resources.

Why not use a browser extension, you ask? Well, because it's limited to the browser, and we do need text from other apps as well. You can argue that one can take a screenshot of that app, then go to the browser and run OCR there. But if you have already opened a browser and can afford to take a screenshot just for that, why not run Google Lens instead of an extension? You get the point.

Background

OCR technology is invaluable for tasks such as digitizing printed documents, extracting text from screenshots, or processing scanned images. By automating OCR, you can:

Instantly access extracted text.
Improve productivity.
Simplify your workflow.

This guide provides a step-by-step walkthrough for setting up OCR on Windows and Linux, ensuring a smooth and user-friendly experience.

Introduction

Why Automate OCR?

Manual text extraction is time-consuming and error-prone. Automating the process ensures:

Faster access to text data.
Minimal effort for repetitive tasks.
A consistent and reliable workflow.

How It Works

We’ll create scripts for Windows and Linux that:

Capture an image or utilize an existing one.
Perform OCR using Tesseract (an open-source OCR engine).
Copy the extracted text directly to the clipboard.

Setup

Prerequisites

Before getting started, ensure you have the following:

Tesseract OCR
- Download and install from Tesseract’s official page.
- Install the language packs you need. You select them at run time with the -l flag (e.g., -l eng for English, -l ara+eng for Arabic and English).
Clipboard Utilities
- Windows: Use nircmd for clipboard operations.
- Linux: Install xclip for clipboard management.
Screenshot Tools
- Windows: Use built-in snipping tools or third-party software.
- Linux: Install flameshot for advanced screenshot functionality.

Procedure

For Windows

1. Create the OCR Script

Create a batch file named sstoocr.bat and save it in a convenient location:

@echo off
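:: Save the clipboard image to a file. /wait pauses the script until nircmd
:: has finished writing it; nircmd.exe is assumed to sit in a "nircmd" subfolder.
start "" /wait nircmd\nircmd.exe clipboard saveimage screenshot.png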

:: Run Tesseract OCR on the image
tesseract screenshot.png output -l ara+eng

:: Copy extracted text to clipboard
type output.txt | clip

:: Optionally, clean up
:: del screenshot.png
:: del output.txt

2. Assign a Shortcut

Place the script on your desktop.
Right-click the script and select Create Shortcut.
Right-click the shortcut, go to Properties, and under the Shortcut tab, assign Ctrl + Alt + O as the shortcut key.

3. Use the Script

Copy an image to the clipboard or take a screenshot.
Press Ctrl + Alt + O.
The extracted text will automatically be copied to your clipboard.

For Linux

1. Create the OCR Script

Create a shell script named flameshot_ocr.sh:

#!/bin/bash
flameshot gui --raw | tesseract -l eng stdin stdout | xclip -selection clipboard

Make the script executable:

chmod +x flameshot_ocr.sh

2. Assign a Shortcut

Open your desktop environment’s keyboard settings.
Add a custom shortcut:
- Command: /path/to/flameshot_ocr.sh
- Shortcut: Ctrl + Shift + O

3. Use the Script

Press Ctrl + Shift + O to open the Flameshot GUI.
Select the area to capture.
The text will be extracted and copied to your clipboard.

Conclusion

By following this guide, you can set up a streamlined OCR solution for both Windows and Linux. With a simple keyboard shortcut, you’ll have quick access to extracted text directly on your clipboard, saving time and effort.

Feel free to customize these scripts to better suit your needs. Happy automating, and may your workflows become ever more efficient!

Saturday, 3 October 2020

Magic of Browser Bookmarks - Automate Simple Tasks using JavaScript

As I promised in #LearnedToday, I'm going to show you how much you can achieve with this little bookmark feature in the browsers.

Ever wondered how to easily remove citations from a Wikipedia page?

What are bookmarks?

Bookmarks in a browser save links to pages you wish to visit again, or that you just find useful and want to keep for later.

Instead of creating a text file "Imp Links" and saving all the links there (I've done it a lot), you could use the browser's bookmark feature.

The shortcut to bookmark a webpage in most browsers is Ctrl+D.

What more can they do?

To sum up, they can run JavaScript on a page. So instead of opening the browser console to run a couple of lines of code, you could create a bookmark and click that instead.

Example?

Whenever I needed to copy something from Wikipedia, I usually had to deal with the references/citations they have. You must've seen those, with square brackets around numbers, something like this [1] or with a disclaimer like [citation needed], etc. I needed to remove all those.

Initially, I used to do it manually in MS Word, with Find and Replace. I don't remember the details now; doesn't matter anyway.

Finally, I came to know about these browser bookmarklets, and then a simple regex was enough to do the work for me.

Now I have a simple bookmark. I go to any Wikipedia page, select the text I need, and click the bookmark. Voilà! Citations are removed.
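To give an idea of how little code this takes, here is a rough sketch of the regex approach. It is not the exact code behind my bookmark; the pattern and the function name are made up for illustration and only cover numeric markers and the [citation needed] tag.

```javascript
// Illustrative only: strips numeric markers like [1] or [23] and the
// "[citation needed]" tag. Other bracketed text such as [sic] is left alone.
const CITATION_PATTERN = /\[(?:\d+|citation needed)\]/gi;

function removeCitations(text) {
  return text.replace(CITATION_PATTERN, "");
}

const sample =
  "Urdu is written in Nastaliq.[1][23] It is widely read.[citation needed]";
console.log(removeCitations(sample));
```

Inside a bookmarklet, the same function could be applied to the selected text, for example the string returned by window.getSelection().toString().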

How to create a bookmarklet?

Go to Bookmarks Manager

1. Click three vertical dots in the upper right corner > Bookmark > Bookmark Manager
Or chrome shortcut: ctrl+shift+o
Or type in the address bar: chrome://bookmarks/
2. Click three vertical dots in the upper right corner of Bookmark manager (Shows tooltip: Organize)
3. Add new bookmark
4. It will show a popup with two fields: Name and URL.
5. Give it any appropriate name and, in the URL field, paste the JavaScript code you want to execute, prefixed with javascript: so the browser runs it instead of treating it as an address.
6. Click Save.

You have your bookmarklet ready.

Show/Hide Bookmarks bar with ctrl+shift+b. Clicking on the name of your bookmark will run the underlying code.
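If you are wondering what the URL of such a bookmark looks like, the sketch below turns a snippet of code into one. The helper name toBookmarklet is made up for this example, and the font snippet is just a stand-in. It wraps the code in a function so its variables stay out of the page's way, then adds the javascript: prefix.

```javascript
function toBookmarklet(source) {
  // The wrapper function keeps variables out of the page's global scope.
  const wrapped = `(function(){${source}})();`;
  // Percent-encoding makes spaces, double quotes and % signs safe in a URL.
  return "javascript:" + encodeURIComponent(wrapped);
}

const url = toBookmarklet('document.body.style.fontFamily = "serif";');
console.log(url);
```

The printed string is what goes into the URL field of the bookmark.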

Any easier way to do this?

If you don't want to go through all those steps, there's a simple tool called Bookmarkleter. Paste your JS code and it will generate a link that you can drag and drop to the bookmarks bar.

For example, drag and drop the following link to your bookmarks bar. This will allow you to change fonts on any website.

Set Font

Which bookmarklets am I using?

Citation Remover: Removes citations from a Wikipedia page. Drag and drop this link to the bookmarks bar: Citation Remover
Set Font: If a website is using a bad font, use this. I use Urdu a lot, and Urdu without a Nastaleeq font looks ugly, so I apply any font available on my system to the page. Payami Nastaleeq is the default one for me.
Calci: A tiny calculator which returns results of simple arithmetic operations.
StyleStripper: Strips all CSS styles from a webpage. Helpful if I don't want to load an entire page I want to copy something from. Also works on most of the sites which disable copying using JavaScript. Click StyleStripper and you can copy the text.

Misc. bookmarklets I created

QuoraSkip: Skip Quora-enforced 'login' popup by removing added elements and blur overlay.

To those who requested, don't complain now. (Abuzar :D) I have shared it finally. More such tips will follow. Keep visiting! And I know you will. ;)

Rab Raakha!

Friday, 2 October 2020

PDF to Single Image - A Tutorial by 17 Year Old Me

Back in the days when I had a small Nokia phone, I wanted to do EVERYTHING on that tiny device. It wasn't actually mine, but because I was going to college, I was more "in need" of it than my sister.

Nokia C1-01: the phone I had during my engineering days

Source: gsmarena.com [1]

The one on your right with maroon border. That was it.

Anyway, with a screen of 144x160px, I wanted to read PDFs which were stored on our desktop + laptop. Lots of books, of almost all genres I was interested in. Interestingly enough, the same neatly arranged folders have been copied over to every computer I have used. So I still have all those books, plus what was added later on.

Initially, the idea to "read PDF on phone" was for the Quran, so that I could read it in the Indo-Pak Naskh font. Actually I had a Quran app on it, full text with a super fast search engine, but the font used in it wasn't good enough for long tilaawat. In fact, even after getting an Android phone I've been searching for something as fast as that app. I was a fan of the guy who built it. Just looked it up: he goes by the name of Raza Mahi. His "Mahi Dictionary" was awesome too. Java .jar applications are a thing of the past now, but he has moved on too and started building similar apps for Android. Good for him. I've linked his website in the references. [2]

So where was I? Yes. As I had difficulty reading the Quran in that app, I selected a PDF copy of Quran which had Arabic text in one column and its Urdu translation side-by-side. I cropped-out the translation part (making the text narrow enough to fit on my phone) and then started thinking about a way to achieve the result.

Necessity is the mother of invention they say, so I came up with two methods (discussed in the booklet below). Will attach the Quran files too for the record. Wow! Time flies. Seems like yesterday to me.

Later on when I converted many books to 'single image' using the same method, I compiled a short tutorial in the form of a booklet. I've left the whole text as is, without any correction in grammar or sentence structure, because

It's a reminder of my journey (read the booklet and see for yourself how writing styles change)
It's cute. ;)

Here's the summary of the two methods discussed in the booklet:

Method 1: Microsoft Office OneNote + MS Paint

Method 2: PDF to Images + IrfanView

Read the booklet and know how to use them. And remember it's an OLD tutorial.

DOWNLOADS

PDF to Single Image Tutorial (Booklet) : Read online or download

https://archive.org/details/PDFToSingleImageShakes.Ahmad

IrfanView: I came to know later on that this was a very popular image-manipulation tool back then, and it still is. Its first release was in June 1996. Now it's more powerful than ever. Check its Wikipedia page. [3]

https://www.irfanview.com

PDF to Images Converter: I still use it. Small size, works smoothly.

https://www.weenysoft.com/free-pdf-to-image-converter.html

Enjoy!

Reference

[1] Specifications of Nokia C1-01 via gsmarena [link]

[2] Raza Mahi Team - Old Apps [link]

[3] IrfanView on Wikipedia [link]

Sunday, 5 April 2020

LTE - Types, Features and Working

WHAT IS IT?

Assuming this is a new term for you and you have no idea what it is, “what on earth does this mean” is the first thing you should ask. Let’s start with the full form: LTE stands for Long-Term Evolution.
Ok. But evolution of what? I don’t know either. According to sources, the naming convention was part of advertising the technology and appealing to the customer base. Alright, enough of the intro; here is a simple explanation borrowed from Wikipedia:
Long-Term Evolution (LTE) is a standard for wireless broadband communication for mobile devices and data terminals.
You still don’t get it, do you? Remember 2G and 3G technologies? LTE is the next milestone in that journey. Its architecture is based on the 3G technology of UMTS, and much of the LTE standard addresses the upgrading of 3G UMTS to what will eventually be 4G.
What’s the major difference between LTE and the third generation (3G)? Well, a large amount of the work is aimed at simplifying the architecture of the system. But is it 4G? We’ll discuss this at the end of this post. For now, let’s jump to its classification.

TYPES

There are basically 2 mobile data transmission technologies based on 2 major factors, viz:
• How data is uploaded and downloaded
• What frequency spectra the networks are deployed in
So, based on these two factors, we have two types of LTE.
1. Long-Term Evolution Time-Division Duplex (LTE-TDD)
2. Long-Term Evolution Frequency-Division Duplex (LTE-FDD)
Before proceeding with this, let’s know some basics of GSM and CDMA so that you know what these “divisions” are. Afterwards, you’ll be able to digest this easily.

GSM, CDMA and LTE

GSM and CDMA are two different ways to accomplish two things: (1) letting many phones share the same piece of spectrum, and (2) actually sending the data over the radio link. LTE is newer.
The way GSM solves (1) is by something called TDMA (time-division multiple access). When you're in a phone call, your phone is scheduled a bunch of time slots in which it either sends or receives data. These slots are exclusive to your phone and different from those of other phones in the cell, so there's no interference. This way, multiple phones can talk to the cell tower (seemingly) at once (the bursts of time are super short so you don't notice them).
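The time-slot idea can be pictured with a toy scheduler. This is a deliberately simplified sketch: a GSM frame does have 8 time slots, but the one-slot-per-phone assignment and all the names below are assumptions made for illustration.

```javascript
// A GSM TDMA frame has 8 time slots. Here each phone simply owns one
// slot per frame, which is a simplification for illustration.
const SLOTS_PER_FRAME = 8;

function slotOwner(slotNumber, phones) {
  const slotInFrame = slotNumber % SLOTS_PER_FRAME;
  return phones[slotInFrame] ?? null; // null means the slot is idle
}

const phones = ["phone-A", "phone-B", "phone-C"];
// Who may transmit during the first ten slots?
const timeline = Array.from({ length: 10 }, (_, n) => slotOwner(n, phones));
console.log(timeline);
```

Each phone gets its turn again one frame later, and no two phones ever own the same slot, which is why they do not interfere.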
CDMA deals with (1) in a completely different way. It breaks up the channel into codes/signals (code-division multiple access). This is a little hard to explain without some math, but there's a notion called orthogonality. If two signals are orthogonal you can pull one signal out without getting interference from the other. Every user is assigned a different code/signal and these are (approximately) orthogonal to each other. This is a more advanced technique and generally thought of as advantageous since there isn't as much waste (TDMA needs little bits of extra time between users to make sure there's no overlap, for example).
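Here is the orthogonality idea with small numbers. It is a toy sketch, not how a real handset is implemented: the two 4-chip codes and the function names are picked for illustration. Two users transmit at the same time, their signals add up on the air, and each one is still recovered by multiplying with its own code.

```javascript
// Two codes whose dot product is zero, i.e. they are orthogonal.
const CODE_A = [1, -1, 1, -1];
const CODE_B = [1, 1, -1, -1];

const dot = (x, y) => x.reduce((sum, xi, i) => sum + xi * y[i], 0);

// Each bit becomes +code (for 1) or -code (for 0).
function spread(bits, code) {
  return bits.flatMap((bit) => code.map((chip) => (bit ? 1 : -1) * chip));
}

// Correlate each chunk with the code; the sign reveals the bit.
function despread(signal, code) {
  const bits = [];
  for (let i = 0; i < signal.length; i += code.length) {
    const chunk = signal.slice(i, i + code.length);
    bits.push(dot(chunk, code) > 0 ? 1 : 0);
  }
  return bits;
}

const bitsA = [1, 0, 1];
const bitsB = [0, 0, 1];
const signalA = spread(bitsA, CODE_A);
const signalB = spread(bitsB, CODE_B);
// Both users transmit together, so the receiver sees the sum.
const onAir = signalA.map((value, i) => value + signalB[i]);

console.log(despread(onAir, CODE_A)); // user A's bits
console.log(despread(onAir, CODE_B)); // user B's bits
```

Because dot(CODE_A, CODE_B) is zero, user B's contribution cancels out when the receiver correlates with CODE_A, and vice versa.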
The way (2) is accomplished is also very different. In fact there are many different ways it is done even within GSM or CDMA. The way data is sent along depends a lot on how good the radio signal is, among other factors. That's a whole other thing. But the options for GSM and CDMA differ.
3G and 4G are kind of marketing terms that come from "3rd generation" and "4th generation". They refer to families of standards, but not specific methods to accomplish (1) or (2).
Now that you know the basics, let’s get back to the types of LTE.

LTE-TDD and LTE-FDD

LTE-TDD uses a single frequency, alternating between uploading and downloading data over time, while LTE-FDD uses paired frequencies to upload and download data.
Despite the differences in how the two types of LTE handle data transmission, LTE-TDD and LTE-FDD share 90 percent of their core technology. This makes it possible for the same chipsets and networks to use both versions of LTE.
Several companies produce dual-mode chips or mobile devices, including Samsung and Qualcomm.

FEATURES

Peak download rates up to 299.6 Mbit/s and upload rates up to 75.4 Mbit/s
Cost effective
Low data transfer latencies
Lower latencies for handover and connection setup time
Higher network throughput
Improved support for mobility, exemplified by support for terminals moving at up to 350 km/h
Orthogonal frequency-division multiple access for the downlink, Single-carrier FDMA for the uplink to conserve power
Support for inter-operation and co-existence with legacy standards (GSM/GPRS or W-CDMA-based UMTS)
Uplink and downlink carrier aggregation
Packet-switched radio interface
It’s because of these features that most carriers supporting GSM networks can be expected to upgrade their networks to LTE at some stage.

MADE OF?

What is LTE made of? Its working backbone consists of the following, most of which we have already discussed above. For the concepts you might not find familiar, I’ve linked resources so that you can get an idea of what they are.
• OFDM (Orthogonal Frequency Division Multiplexing) for Downlink
• SC-FDMA (Single Carrier FDMA) for Uplink
• MIMO (Multiple Input Multiple Output)
• E-UTRAN (for Network)

VOICE CALLS IN LTE

One of the major problems they faced designing LTE was how to handle voice calls using it. LTE was primarily meant for (internet) data transfer, so the transfer of voice data to integrate with telecom operators was an issue.
With the adoption of LTE, carriers had to re-engineer their voice call network. The reason behind this was that the LTE standard supports only packet switching with its all-IP network. On the other hand, voice calls in GSM, UMTS and CDMA2000 are circuit switched.
3 different approaches sprang up to handle this:

1] Voice over LTE (VoLTE)

VoLTE networks support both voice and data at the same time, without one hampering the other. Traditional LTE networks, on the other hand, may or may not support data and voice together, or may degrade the quality of the voice call.

2] Circuit-Switched Fallback (CSFB)

LTE provides only data services. When a voice call is to be made, the handset falls back to the circuit-switched domain.
Advantage: Operators can provide services quickly.
Disadvantage: Call setup takes longer.

3] Simultaneous Voice and LTE (SVLTE)

The handset works simultaneously in the LTE and circuit-switched modes, with the LTE mode providing data services and the circuit-switched mode providing the voice service. This is a solution based solely on the handset, which places no special requirements on the network.
Disadvantage: The phone can become expensive and consume more power.

IS IT 4G?

Now the controversy (not a big one, I know… but still, it is there.)
Contrary to popular belief, LTE at its current stage was not always considered 4G. The ITU (International Telecommunication Union) determines what can be considered 4G, and it had initially defined the requirements a technology had to meet. LTE couldn’t meet those requirements.
Therefore, LTE is popularly known as 3.95G.
LTE-Advanced did make the cut. But businesses and telecom operators allegedly “influenced” the ITU to update its standards so that they could advertise their services as 4G to attract users.
As a result, there is a slight disagreement between businesspeople and technophiles on the definition of 4G. Technophiles consider the original ITU guidelines the standard for 4G.

CONCLUSION

To solve “How to get many people to share a piece of spectrum”, LTE uses OFDMA, which increases throughput.
Hope you got at least the gist of what’s been explained in this blog. If not, jump over to the pages linked in the article, or post a comment if you are reading this on ShakesVision.

SHAKEEB AHMAD
April 05, 2020
