Wednesday, August 17, 2022

What is Hashing and Types of Hashing?

 Hashing:   

Hashing is yet another method used for making retrieval faster. It provides direct access to a record on the basis of the value of a specific field called the hash_field. When a new record is inserted, it is physically stored at an address which is computed by applying a mathematical function (the hash function) to the value of the hash field. Thus, for every new record, hash address = f(hash_field), where f is the hash function.

 Later, when a record is to be retrieved, the same hash function is used to compute the address where the record is stored. Retrievals are faster since a direct access is provided and there is no search involved in the process.

 An example of a typical hash function is a numeric hash field, say an id, modulo a very large prime number. Because hashing ties a field value to the address of the record, multiple hash fields would map a record to multiple addresses at the same time. Hence, there can be only one hash field per file.

 Example: Consider the example of the Student table. Let Stuid be the hash field and the hash function be defined as ((Stuid mod 10000)*64+1025). The records with Stuid 10001, 10002, 10003 etc. will be stored at addresses 1089, 1153, 1217 etc. respectively. 
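The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the function name hash_address is made up for illustration, and the formula is the one given above.

```python
def hash_address(stuid: int) -> int:
    """Address computed by the example hash function from the text."""
    return (stuid % 10000) * 64 + 1025


# Print the address for each of the sample student ids.
for stuid in (10001, 10002, 10003):
    print(stuid, "->", hash_address(stuid))
```

Note that two ids which differ by a multiple of 10000 (for example 1 and 10001) would get the same address. Such collisions are what the overflow handling in the next section deals with.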

Types of hashing: 

There are two types of hashing: 

 i) Static hashing 

 ii) Dynamic hashing

i) Static hashing has a fixed number of primary pages in the directory. Thus, when a bucket is full, we need an overflow bucket to store any additional records that hash to the full bucket. This can be done with a link to an overflow page, or a linked list of overflow pages. The linked list can be separate for each bucket, or shared by all buckets that overflow. When searching for a record, the original bucket is accessed first, then the overflow buckets. If many keys hash to the same bucket, locating a record may require accessing multiple pages on disk, which greatly degrades performance.

ii) Dynamic hashing: The problem of lengthy searching of overflow buckets is solved by Dynamic Hashing. In Dynamic Hashing the size of the directory grows with the number of collisions to accommodate new records and avoid long overflow page chains. Extendible and Linear Hashing are two dynamic hashing techniques.
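To see why long overflow chains hurt, here is a small in-memory sketch of static hashing with a separate overflow chain per bucket. Everything in it is an assumption made for illustration: the class names Page and StaticHashFile, the key mod number-of-buckets hash function, and the tiny page capacity. A real DBMS works with disk pages, not Python lists.

```python
class Page:
    """One fixed-capacity page; 'overflow' links to the next overflow page."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []
        self.overflow = None


class StaticHashFile:
    """Static hashing: the number of primary pages never changes."""

    def __init__(self, num_buckets=4, page_capacity=2):
        self.num_buckets = num_buckets
        self.page_capacity = page_capacity
        self.primary = [Page(page_capacity) for _ in range(num_buckets)]

    def insert(self, key, record):
        page = self.primary[key % self.num_buckets]
        # Walk the chain until a page with free space is found,
        # adding a new overflow page at the end if necessary.
        while len(page.records) >= page.capacity:
            if page.overflow is None:
                page.overflow = Page(self.page_capacity)
            page = page.overflow
        page.records.append((key, record))

    def search(self, key):
        """Return (record, pages_read); record is None if the key is absent."""
        page = self.primary[key % self.num_buckets]
        pages_read = 0
        while page is not None:
            pages_read += 1
            for stored_key, record in page.records:
                if stored_key == key:
                    return record, pages_read
            page = page.overflow
        return None, pages_read


hash_file = StaticHashFile(num_buckets=4, page_capacity=2)
for key in (0, 4, 8, 12, 16):  # all of these keys hash to bucket 0
    hash_file.insert(key, f"record-{key}")

print(hash_file.search(0))   # found in the primary page
print(hash_file.search(16))  # found only after following the overflow chain
```

The number of pages read grows with the length of the chain, which is the cost that dynamic hashing tries to avoid by growing the directory instead.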


Indexing: Indexing is another common method for making retrievals faster. Consider the example of the CUSTOMER table. The following query is based on the customer's city: "Retrieve the records of all customers who reside in Delhi". Here a sequential search on the CUSTOMER table has to be carried out, and all records with the value 'Delhi' in the Cust_City field have to be retrieved.

The time taken for this operation depends on the number of pages to be accessed. If the records are randomly stored, the number of page accesses depends on the volume of data. If the records are stored physically together, the number of pages also depends on the size of each record. If such queries based on the Cust_City field are very frequent in the application, steps can be taken to improve their performance. Creating an index on Cust_City is one such method. This results in the scenario as shown below.





A new index file is created. The number of records in the index file is the same as that of the data file. The index file has two fields in each record. One field contains the value of the Cust_City field and the second contains a pointer to the actual data record in the CUSTOMER table.


Whenever a query based on the Cust_City field occurs, a search is carried out on the index file. This search will be much faster than a sequential search in the CUSTOMER table, even if the CUSTOMER records are stored physically together. This is because an index record is much smaller than a data record, so each page can hold a larger number of records.


When the records with value 'Delhi' in the Cust_City field in the index file are located, the pointer in the second field of the records can be followed to directly retrieve the corresponding CUSTOMER records.


Thus the access involves a sequential access on the index file and a direct access on the actual data file.
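The two-step lookup can be sketched in Python as follows. This is a toy model under stated assumptions: the customer rows and names are invented sample data, a list position stands in for the pointer to a data record, and find_by_city is a hypothetical helper name.

```python
# Data file: the CUSTOMER table (invented sample rows).
customers = [
    {"cust_id": 1, "cust_name": "Asha", "cust_city": "Delhi"},
    {"cust_id": 2, "cust_name": "Ravi", "cust_city": "Patna"},
    {"cust_id": 3, "cust_name": "Meena", "cust_city": "Delhi"},
    {"cust_id": 4, "cust_name": "John", "cust_city": "Mumbai"},
]

# Index file: one small record per data record, holding the
# Cust_City value and a pointer (here, the position in the list).
index_file = [(row["cust_city"], position) for position, row in enumerate(customers)]


def find_by_city(city):
    """Scan the index file sequentially, then follow each pointer directly."""
    pointers = [pointer for value, pointer in index_file if value == city]
    return [customers[pointer] for pointer in pointers]


for row in find_by_city("Delhi"):
    print(row["cust_id"], row["cust_name"])
```

The index here is scanned from start to end, as described in the text. Real systems usually keep the index sorted or in a tree so that even this scan can be avoided.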

Retrieval Speed v/s Update Speed: Though indexes help make retrievals faster, they slow down updates on the table, since updates on the base table demand an update on the index as well.


It is possible to create an index with multiple fields i.e., index on field combinations. Multiple indexes can also be created on the same table simultaneously though there may be a limit on the maximum number of indexes that can be created on a table.

 

Tuesday, August 16, 2022

Characteristics and Benefits/Advantages of a Database?

 Characteristics and Benefits/Advantages of a Database:

 There are a number of characteristics that distinguish the database approach from the file-based system or approach. This chapter describes the benefits (and features) of the database system.


 1. Self-describing nature of a database system:

 A database system is referred to as self-describing because it not only contains the database itself, but also metadata which defines and describes the data and relationships between tables in the database. 

This information is used by the DBMS software or database users if needed. This separation of data and information about the data makes a database system totally different from the traditional file-based system in which the data definition is part of the application programs.

 2. Insulation (Cover) between program and data: 

In the file-based system, the structure of the data files is defined in the application programs so if a user wants to change the structure of a file, all the programs that access that file might need to be changed as well. On the other hand, in the database approach, the data structure is stored in the system catalogue and not in the programs. 

Therefore, one change is all that is needed to change the structure of a file. This insulation between the programs and data is also called program-data independence.

 3. Support for multiple views of data: 

A database supports multiple views of data. A view is a subset of the database, which is defined and dedicated for particular users of the system. Multiple users in the system might have different views of the system. Each view might contain only the data of interest to a user or group of users.

 4. Sharing of data and multiuser system:

 Current database systems are designed for multiple users. That is, they allow many users to access the same database at the same time. This access is achieved through features called concurrency control strategies. These strategies ensure that the data accessed are always correct and that data integrity is maintained. 

The design of modern multiuser database systems is a great improvement from those in the past which restricted usage to one person at a time. 


5. Control of data redundancy:

 In the database approach, ideally, each data item is stored in only one place in the database. In some cases, data redundancy still exists to improve system performance, but such redundancy is controlled by application programming and kept to a minimum by introducing as little redundancy as possible while designing the database.


 6. Data sharing: 

The integration of all the data, for an organisation, within a database system has many advantages. First, it allows for data sharing among employees and others who have access to the system. Second, it gives users the ability to generate more information from a given amount of data than would be possible without the integration.

7. Enforcement of integrity constraints: 

Database management systems must provide the ability to define and enforce certain constraints to ensure that users enter valid information and maintain data integrity.

 A database constraint is a restriction or rule that dictates what can be entered or edited in a table such as a postal code using a certain format or adding a valid city in the City field. There are many types of database constraints. 

A data type, for example, determines the sort of data permitted in a field, such as numbers only. Data uniqueness, enforced for example by a primary key, ensures that no duplicates are entered. Constraints can be simple (field based) or complex (programming).
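As a rough illustration, the sketch below uses Python's built-in sqlite3 module to declare a few such constraints. The table layout, the list of valid cities, the six-digit postal code rule and the helper try_insert are all assumptions chosen for the example. Other database systems use similar but not identical syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        cust_id     INTEGER PRIMARY KEY,   -- uniqueness: no duplicate ids
        cust_city   TEXT NOT NULL
                    CHECK (cust_city IN ('Delhi', 'Patna', 'Mumbai')),
        postal_code TEXT NOT NULL          -- format: exactly six digits
                    CHECK (length(postal_code) = 6
                           AND postal_code NOT GLOB '*[^0-9]*')
    )
    """
)


def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("INSERT INTO customer VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert((1, "Delhi", "110001")))     # valid row
print(try_insert((1, "Patna", "800001")))     # duplicate primary key
print(try_insert((2, "Atlantis", "110001")))  # city not in the valid list
print(try_insert((3, "Delhi", "11A001")))     # postal code in the wrong format
```

The point is that the rules live in the database definition, so every program that inserts data is checked in the same way.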


 8. Restriction of unauthorized access:

Not all users of a database system will have the same accessing privileges. For example, one user might have read-only access (i.e., the ability to read a file but not make changes), while another might have read and write privileges, which is the ability to both read and modify a file. For this reason, a database management system should provide a security subsystem to create and control different types of user accounts and restrict unauthorized access.

 9. Data independence:

 Another advantage of a database management system is how it allows for data independence. In other words, the system data descriptions or data describing data (metadata) are separated from the application programs. This is possible because changes to the data structure are handled by the database management system and are not embedded in the program itself.

 10. Transaction processing:

A database management system must include concurrency control subsystems. This feature ensures that data remains consistent and valid during transaction processing even if several users update the same information.

 11. Provision for multiple views of data: 

By its very nature, a DBMS permits many users to have access to its database either individually or simultaneously. It is not important for users to be aware of how and where the data they access is stored. 


12. Backup and recovery facilities:

Backup and recovery are methods that allow you to protect your data from loss. The database system provides a separate process, from that of a network backup, for backing up and recovering data. If a hard drive fails and the database stored on the hard drive is not accessible, the only way to recover the database is from a backup.

 If a computer system fails in the middle of a complex update process, the recovery subsystem is responsible for making sure that the database is restored to its original state. These are two more benefits of a database management system.
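The "restore to the original state" idea can be tried out with Python's built-in sqlite3 module. This is only a sketch: the account table, the balances, and the helpers transfer and balances are invented for the example, and a raised exception stands in for a system failure in the middle of an update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 5000), (2, 3000)])
conn.commit()


def transfer(src, dst, amount, fail_midway=False):
    """Move money between accounts as one transaction."""
    try:
        with conn:  # commits if the block finishes, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = ?", (amount, src)
            )
            if fail_midway:
                raise RuntimeError("simulated failure in the middle of the update")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = ?", (amount, dst)
            )
        return True
    except RuntimeError:
        return False


def balances():
    return dict(conn.execute("SELECT acc_no, balance FROM account ORDER BY acc_no"))


transfer(1, 2, 1000, fail_midway=True)
print(balances())  # the half-finished update has been undone

transfer(1, 2, 1000)
print(balances())  # the complete update has been applied
```

Either both updates are applied or neither is, so the money debited from one account is never lost halfway.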

What is Database Engine and Query Processor?

        1. What is Database Engine? 

  •  It is the heart of the DBMS. It stores, updates, and retrieves data. It also increases the speed and scalability of a database.

          2. What is Query Processor? 

  • It is the functional component of DBMS. Its function is to break down a query statement (given by the user) into instructions understood by the DBMS. It helps the database system to access and update data.


1. Clustering:

 The method of storing logically related records physically together is called clustering. In this process, if the page containing the requested record is already in the memory, retrieval from the disk is not necessary. In such a situation, time taken for the whole operation will be less. Thus, if records which are frequently used together are placed physically together, more records will be in the same page. 

Hence, the number of pages to be retrieved will be less and this reduces the number of disk accesses which in turn gives a better performance. 

For example: Assume that the customer record size is 128 bytes and the typical size of a page retrieved by the file manager is 1KB (1024 bytes). If there is no clustering, it can be assumed that the customer records are stored at random physical locations. In the worst case scenario, each record may be placed in a different page.

 Hence, a query to retrieve 100 records with consecutive customer ids (say 10001 to 10100) will require 100 pages to be accessed, which in turn translates to 100 disk accesses.

 But, if the records are clustered, a page can contain 8 records. Hence, the number of pages to be accessed for retrieving the 100 consecutive records will be ceil(100/8) = 13, i.e. only 13 disk accesses will be required to obtain the query results. Thus, in the above example, clustering improves the speed by a factor of about 7.7.
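The numbers in this example can be reproduced with a short calculation. The variable names below are chosen just for this sketch, and the unclustered figure is the worst case assumed in the text (one page per record).

```python
import math

record_size = 128      # bytes per customer record
page_size = 1024       # bytes per page (1KB)
records_wanted = 100   # consecutive customer records to retrieve

records_per_page = page_size // record_size                         # 8 records fit in a page
pages_unclustered = records_wanted                                  # worst case: one page per record
pages_clustered = math.ceil(records_wanted / records_per_page)      # ceil(100/8)
speedup = pages_unclustered / pages_clustered

print(records_per_page, pages_unclustered, pages_clustered, round(speedup, 1))
```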

  •  There are two types of clustering: 
  1.  Intra-file clustering.
  2.  Inter-file clustering 


  1.  Intra-file clustering: 
  • When clustered records belong to the same file (table), this is called intra-file clustering.

 2. Inter-file clustering:
  • When clustered records belong to different files (tables), this is called inter-file clustering. This type of clustering may be required to enhance the speed of queries retrieving related records from more than one table. Here, interleaving of records is used.

Saturday, August 13, 2022

How many types of DBMS?

 Types of DBMS:

 There are six types of DBMS. They are as follows: 

1. Hierarchical DBMS 

2. Network DBMS 

3. Relational DBMS 

4. Distributed DBMS 

5. Object Oriented DBMS 

6. Object Relational DBMS

 1. Hierarchical DBMS:

 It is called HDBMS because it is based on the hierarchical data model. This data model was developed by IBM in 1968 and introduced in IMS (Information Management System). The model is structured like a tree, with the records forming the nodes and fields forming the branches of the tree. In this model, a parent record can have several child records, but a child can have only one parent record.



  •  Advantages of HDBMS: 

1. It is a very simple, analytical and natural method of implementing record relationships.

2. This model is useful when there is some hierarchical character in the database.

3. It is a popular data model because it is very easy to adopt, implement and modify.

  • Disadvantages of HDBMS: 

1. It cannot represent all the relationships that occur in the real world.

 2. It cannot demonstrate the overall data model for the enterprise because of the non-availability of actual data at the time of designing the data model.

 3. This data model is used only when there is hierarchical character in the concerned database. It cannot represent many-to-many relationships.

 4. Insert anomaly/problem: It is not possible to insert data about a new dependent if its superior record is not available.

 5. Delete anomaly: Deleting data in a hierarchical database leads to the loss of a lot of information.

 6. Update anomaly: In HDBMS, updating records and data is another problem. For example, suppose you want to change the address of a student from Delhi to Patna. Then you will face two problems:

a) You need to search the entire database to find every occurrence of that particular student and make the change wherever his/her details appear. If you miss even a single occurrence, you will face an inconsistency problem.

b) The student might then be shown as being in Patna in one place and in Delhi in another place.

WHAT IS THE FUNCTION OF DBMS?

 FUNCTIONS OF DBMS: 

The major functions of DBMS are as follows:

 i) Data Definition: The DBMS defines the structure of the data in the application. This includes defining and modifying the record structure, the type and size of fields, and the various constraints/conditions to be satisfied by the data in each field.

 ii) Data Manipulation: After defining the data structure, data needs to be inserted, modified or deleted.

iii)Data Security and Integrity: The DBMS contains functions which handle the security and integrity of data in the application. These can be easily invoked by the application and hence, the application programmer need not code these functions in his/her program. 

iv)Data Recovery and Concurrency: Recovery of data after a system failure and concurrent access of records by multiple users are also handled by the DBMS.

 v) Data Dictionary Maintenance: Maintaining data dictionary which contains the data definition of the application is also considered as one of the functions of DBMS. 

vi) Performance: Optimising the performance of queries is one of the important functions of DBMS. Hence, it has a set of programs forming the query optimiser, which evaluates the different implementations of a query and chooses the best among them. Thus, the DBMS provides an environment that is both convenient and efficient to use when there is a large volume of data and many transactions to be processed.

Friday, August 12, 2022

History of DBMS or Different People Behind DBMS:-


 As the use of computers in maintaining data increased in the late 1960s, the DBTG (Database Task Group) of CODASYL (Conference on Data Systems Languages) was set up to propose DBMS standards.

 Charles Bachman, who was working on the development of the first commercial DBMS, IDS (Integrated Data Store), from 1964 onwards, introduced the earliest diagrammatic technique for representing relationships in a database, called data structure diagrams.

 The database standards specifications were published by the CODASYL committee in what is referred to as the CODASYL DBTG 1971 report. This report contains schema and subschema DDL and DML for use with COBOL. A revised report was made in 1978 and another revision was made in 1981.

 The relational model was proposed by E. F. Codd in a paper in 1970. Relational algebra and the theoretical foundations of the relational model were discussed by Codd in his subsequent papers published in 1971, 1972 and 1974. The first hierarchical DBMS, IMS (Information Management System), was developed by IBM in the late 1960s.

 However, there are very few documents available regarding the theoretical emergence of the hierarchical model. SQL (Structured Query Language), the most popular query language, was described by Boyce et al. in 1975. A lot of work has been done to produce the DBMS now in use. Some of the contributors are Chamberlin, Date and others. ANSI outlined the original SQL standard in 1986, which was revised in 1992.

Users of DBMS:-

 1. DBA (Database Administrator): A database administrator (short form DBA) is a person responsible for the installation, configuration, upgrade, administration, monitoring and maintenance of physical databases.

 The role includes the development and design of database strategies, monitoring and improving database performance and capacity, and planning for future expansion requirements. They may also plan, co-ordinate and implement security measures to safeguard the database.


Role of DBA: 

The person who has control over both programs and data of the system is called Database Administrator (DBA). The functions of DBA include the following: 

i) Schema definition:

The DBA creates the original database schema by writing a set of definitions that is translated by the DDL compiler into a set of tables that are stored permanently in the data dictionary.

 ii) Storage structure and access method definition:

The DBA creates appropriate storage structures and access methods by writing a set of definitions which is translated by the data storage and data definition language compiler.

iii) Schema and physical-organisation modification:

 Programmers accomplish the modifications either to schema or physical storage organisation by writing a set of definitions used by DDL or data storage compiler to generate modifications to appropriate internal system.

 iv)Granting data access authorisation: 

The granting of different types of authorisation allows the DBA to regulate which parts of the database various users can access. 

v) Integrity constraint specifications: The data values stored in the database must satisfy certain types of consistency constraints. Example: Account balance cannot fall below Rs.1000/- in case transaction is done by cheque. Such a constraint must be defined explicitly by the DBA. 

2. Database Designers:

 Database designers are responsible for designing the database objects. Database objects like tables, columns, their data types, forms and reports must be properly designed as per the requirements so that they can be used flexibly by the users. The database designers have to communicate frequently with all the prospective database users, and understand and study their requirements. They develop the view of the database that supports those requirements. Once the database design is complete, they assist the DBA.

3. Application Programmers: 

Those people who write the code of the programs according to the designs suggested by the database designers are called application programmers. They also test, debug, document and maintain these programs.

 4. End Users:

Those people who access the database for different purposes are called end users. They are categorised as follows:

 i) Casual end users.

ii) Naive or Parametric end users.

 iii) Sophisticated end users .

iv) Stand alone users.

What is Information technology?


 The methods used to collect and store data, process the data into information and communicate the information all over the world are together called Information Technology.

This technology is a revolution sweeping across the world. It is interesting to know that 75% of all information generated in the entire history of mankind has been generated in the last 30 years.

Thursday, August 11, 2022

What is the difference between data and information?

 Q.What is the difference between data and information? 

Ans. The major differences between data and information are as follows


What is File Processing System?

 What is File Processing System? 

Ans. A File Processing System (FPS) stores permanent records in various files, and it needs different application programs to extract records from, and add records to, the appropriate files. Whenever a new need arises, system programmers write new application programs to meet it, so the system keeps acquiring more application programs and more files, which is time consuming. This creates complexity and reduces the efficiency of the system. FPS has a number of disadvantages, which are given below:

 1. Data redundancy.

 2. Inconsistency.

 3. Data Isolation.

 4. Accessing problem .

5. Security .

6. Concurrency.

 7. Integrity .

8. Database 



As shown in the figure, in an FPS, different programs in the same application may be interacting with different private data files. There is no system enforcing any standardised control on the organisation and structure of these data files.

What is RDBMS (Relational Database management System)

 Q. What is  RDBMS (Relational Database management System):-

It stands for "Relational Database Management System." An RDBMS is a DBMS designed specifically for relational databases. Therefore, RDBMS is a subset of DBMS.

 A relational database refers to a database that stores data in a structured format, using rows and columns. This makes it easy to locate and access specific values within the database. It is "relational" because the values within each table are related to each other. Tables may also be related to other tables. The relational structure makes it possible to run queries across multiple tables at once. 
While a relational database describes the type of database an RDBMS manages, the RDBMS refers to the database program itself. It is the software that executes queries on the data, including adding, updating, and searching for values. An RDBMS may also provide a visual representation of the data. For example, it may display data in tables like a spreadsheet, allowing you to view and even edit individual values in the table.
Some RDBMS programs allow you to create forms that can streamline entering, editing, and deleting data. Most well known DBMS applications fall into the RDBMS category. Examples include Oracle Database, MySQL, Microsoft SQL Server, and IBM DB2. Some of these programs support non-relational databases, but they are primarily used for relational database management. Examples of non-relational databases include Apache HBase, IBM Domino, and Oracle NoSQL Database. These types of databases are managed by other DBMS programs that support NoSQL, which do not fall into the RDBMS category.


Advantages of RDBMS: 
1. It is easy to use.
 2. It is secured in nature. 
3. The data manipulation can be done. 
4. It limits redundancy and replication of the data. 
5. It offers better data integrity.
6. It provides better physical data independence. 
7. It offers logical database independence, i.e. data can be viewed in different ways by different users.
 8. It provides better backup and recovery procedures. 
 9. It provides multiple interfaces.
10. Multiple users can access the database which is not possible in DBMS. 



Disadvantages of RDBMS: 
 1. Software is expensive.
 2. Complex software requires expensive hardware and hence increases the overall cost of using the RDBMS.
 3. It requires skilled human resources to implement. 
 4. Certain applications are slow in processing. 
 5. It is difficult to recover the lost data.

What is SQ3R TECHNIQUE OF READING:

Q. SQ3R TECHNIQUE OF READING:

This technique of reading was devised by Robinson in his book 'Effective Study' (1970).

SQ3R stands for the initial letters of the five steps in studying a text.

S- Survey

Q- Question 

R- Read

R- Recall

R- Review

Survey:

Survey refers to a quick glance through the title page, preface, and chapter headings of a text. By surveying, the learner will be able to determine the main ideas of the text. Besides, the author's name, the date and place of publication and the title page can give the reader an idea of the general subject area. The table of contents, a preface or a foreword in a book would give you an idea of the themes and how they are organised. A survey of the index or bibliography tells you immediately whether the book contains what you need.

Question:

A survey of the text will surely raise a few questions in your mind regarding the text. Some of the questions could be:

1. Is the book useful or relevant to my study?

2. Does it provide some guidelines/information on the subject at hand?

As you go through the individual chapters, you might have more specific questions regarding the topic.

After surveying and questioning, you begin the actual reading.

Reading:-

You need to develop a critical approach to reading anything. Read the text over and over again, each time with a different question and a different purpose in mind. The "I read it once and understood everything" kind of attitude is nothing but a myth. Hence, while reading the first time, you just focus on the main points/ideas and supporting details only.

Recall:-

Reading is not an isolated activity. Every reading exercise increases your background knowledge. You should be able to connect the information gained to the already existing background knowledge. Recalling whatever you have read would enable you to connect, relate the content to the previous and future learning of the subject. This leads us to the next stage in reading i.e. review.

Review:-

Reviewing is nothing but checking whether we have followed the earlier stages promptly and efficiently. Have we surveyed the book, article or magazine properly? Have we asked the appropriate questions relating to the content, have we read critically, and have we recalled the most significant details/information required for our study? These are the questions you should ask in the final stage of reading. Review will sharpen your critical faculty, and you will be able to form your own opinions on the topic and express them to others.

                                               

                                         


                                                    






What is Network DBMS.

 Q. What is Network DBMS: 


In the network model, a parent record can have several child records and a child can also have more than one parent record. Records are physically linked through linked lists. In this model, data are represented by records using links among them. It is an improvement over the hierarchical model. In this system we can have many-to-many relationships among records.

 The network data model is similar to the hierarchical model except that an entity can have more than one parent. Integrated Database Management System and System 2000 are examples of NDBMS.






Advantages of NDBMS: 
  • It is useful to represent such records which have many-to-many relationships.
  • In this model, the problem of inconsistency doesn't occur because a data element is physically located at just one place.
  • Searching for a record is easy because there are many access paths to a data element.


Disadvantages of NDBMS: 
  • In this model all records are maintained using pointers and hence, the whole database becomes very complex.
  • Insertion, update and deletion of any record could require pointer adjustment.





Wednesday, August 10, 2022

What is data?

 Q. What is data?

 Ans. Recording of any meaningful thing in an understandable form is called data. 
 Or
 The facts related to people, places, things and events are called data.
 Or
 Things that can be processed in a meaningful way are known as data.

 Example:
 Student, Army, Sitamarhi, College, Furniture, Electricals, date of birth, Anniversary etc. Or Data refers to the symbols that represent people, events, things, and ideas. Data can be a name, a number, the colors in a photograph, or the notes in a musical composition.